Waiting on a file: poll_fd and RawFD


#1

Hello,
I’ve found a few different ways to wait on a file until data is available to read. Of these, poll_fd seems to match my use case best, but it requires a RawFD, which appears only once in the docs, so I’m not sure whether I’m using it correctly or should be using it at all. f=open("test","w");poll_fd(RawFD(fd(f)),1;readable=true) seems to work (0.5.0 on macOS). Is there anything wrong with that usage?

This seems like a good topic for an issue/PR to improve the docs, agreed?

Thanks

PS: For what it’s worth, here are the other methods I found:

  • watch_file("test", 1)
  • f=open("test","w");wait(RawFD(fd(f));readable=true)
  • f=open("test","w");read(f,10;all=true)
    • the docs say: “If all is true (the default), this function will block repeatedly trying to read all requested bytes, until an error or end-of-file occurs. Note that not all stream types support the all option.” This call returns immediately with an empty array, so I guess files on macOS don’t support the all option?

#2

Well, it seems I may be doing something wrong after all: the following code reliably produces a bunch of libuv errors and then kills Julia on my machine:

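# Create 20 temp files, each opened once for writing and once for reading.
names = [tempname() for i=1:20]
fs = [open(name,"w") for name in names]
fs2 = [open(name,"r") for name in names]
f=fs[1]

# Writer: append one byte every 2 seconds and flush it.
function write_a_lot(f)
  while true
    sleep(2)
    write(f,UInt8(0))
    flush(f)
  end
end
writers = [@schedule write_a_lot(f) for f in fs]

# Reader: drain what is available, then poll the descriptor for up to 0.08 s.
function wait_a_lot(f)
  while true
    read(f)
    poll_fd(RawFD(fd(f)),0.08;readable=true)
  end
end
readers = [@schedule wait_a_lot(f) for f in fs2]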

gist with code and output

Any ideas? Otherwise I’ll just use watch_file for now, which seems to work, though I’d expect it to be slightly less efficient, since it parses the filename and creates new file descriptors on every call.
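For reference, the watch_file fallback could look roughly like the sketch below. It assumes Julia 1.x, where watch_file lives in the FileWatching standard library (on 0.5 it was available without the import); the scratch path and the 5-second timeout are made up for the demo, and which event flags come back can differ between platforms.

```julia
using FileWatching  # assumption: Julia 1.x; not needed on 0.5

path = tempname()   # hypothetical scratch file for the demo
touch(path)

# Watch in a background task so this task is free to modify the file.
# The second argument is a timeout in seconds.
watcher = @async watch_file(path, 5)

sleep(0.5)          # give the watcher time to start
open(path, "a") do io
    write(io, UInt8(0))
end

# fetch blocks until watch_file returns a FileEvent.
event = fetch(watcher)
println("changed=", event.changed,
        " renamed=", event.renamed,
        " timedout=", event.timedout)
rm(path)
```

Each call takes the path rather than an open descriptor, which is the per-call cost mentioned above.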


#3

It sounds like you are actually looking for Pipe() (and the not terribly obvious Base.link_pipe function for opening them)? The Linux design of (e)poll is rather rough. Julia tries to work around some of the issues with the kernel interface, but, as a fair warning, file descriptors pointing to files aren’t entirely valid targets for poll_fd.
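A minimal sketch of the pipe approach, assuming Julia 1.x, where the function is spelled Base.link_pipe! (it is unexported, so treat it as internal and subject to change). The message text is made up for illustration.

```julia
# Assumption: Julia 1.x. Base.link_pipe! is an unexported Base function.
p = Pipe()
Base.link_pipe!(p; reader_supports_async=true, writer_supports_async=true)

# p.in is the write end, p.out is the read end.
write(p.in, "hello\n")

# readline blocks the current task until a full line has arrived.
line = readline(p.out)
println(line)

close(p)
```

Unlike a regular file, the read end of a pipe does block until the writer supplies data or closes its end.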

“This call returns immediately with an empty array”

That’s exactly what it’s documented to do: read until end-of-file and return everything that was read. A file freshly opened with "w" is empty, so end-of-file is reached immediately. If you later change the file (e.g. call write), that’s a different issue: a filesystem has no way to block a reader until more data is written.
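A small sketch of that behavior on a regular file, with a made-up scratch file: the first read returns the current contents, and a second read on the same handle returns an empty array straight away instead of waiting for future writes.

```julia
path = tempname()            # hypothetical scratch file for the demo
write(path, "abc")

io = open(path, "r")
first_read  = read(io)       # the three bytes currently in the file
second_read = read(io)       # already at end-of-file: returns at once, empty
close(io)
rm(path)

println(length(first_read), " bytes, then ", length(second_read), " bytes")
```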


#4

x-ref: https://github.com/JuliaLang/julia/pull/20460


#5

The normal thing is just to call a read function in a task (a “green thread”, e.g. spawned with @async). The task will wake up when there is data available to read (assuming another task is also blocked or yielding).
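As a rough sketch of that pattern, here is a reader task on a pipe. It assumes Julia 1.x (Base.link_pipe! is unexported and internal); the message text and the 0.1-second pause are arbitrary choices for the demo.

```julia
# Assumption: Julia 1.x. Base.link_pipe! is an unexported Base function.
p = Pipe()
Base.link_pipe!(p; reader_supports_async=true, writer_supports_async=true)

received = String[]

# The reader task sleeps inside eof/readline and is woken when data arrives.
reader = @async begin
    while !eof(p.out)            # blocks only this task until data or EOF
        push!(received, readline(p.out))
    end
end

for i in 1:3
    write(p.in, "message $i\n")
    sleep(0.1)                   # yields, so the reader task gets to run
end

close(p.in)                      # closing the write end signals EOF
wait(reader)                     # reader loop ends once EOF is seen
println(received)
```

The main task never polls; it only yields (here through sleep and wait), which is what lets the scheduler resume the reader.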