Strange interaction between `Channel()` and pipes


#1

Hello,

I am trying to make a generator for reading a list of “files”, where a file may be an actual file or a subprocess. For the interface it is nice if the result can be a file descriptor, as the interpretation of the content depends on the caller.

I make the generator with Channel() and push!(), and the subprocess with open(`command`).

The problem is that the subprocess’s file descriptor yields 0 bytes when the caller reads from it. The problem is maybe best demonstrated by the following code:

genpipe(file, pipe=false) = Channel() do c
	for i in 1:2
		if pipe
			fd, process = open(`cat $file`, "r")
		else
			fd = open(file, "r")
		end
		push!(c, fd)
		close(fd)
	end
end

open("out.txt", "w") do fd
	println(fd, "Hello, pipe")
end

for pipe in [false, true]
	for fd in genpipe("out.txt", pipe)
		println("pipe ", pipe, " len ", length(read(fd)))
	end
end

Result (julia-0.6):

pipe false len 12
pipe false len 12
pipe true len 0
pipe true len 0

The last two lines are unexpected.

I’ve also tried the open() do fd construct within the generator, but this gives the same result.

Cheers,

—david


#2

Since you’re calling close(fd) after a very short pause, it seems unsurprising that cat didn’t have enough time to copy the results out, but loading the file directly did.


#3

Thanks,

Well, I don’t really understand the intricacies of Channel() and push!(), but so far I’ve been using them as a generator construct, similar to Python generators.

But to my understanding and experience, the close(fd) (either explicit, as here, or implicit if coded in a do construct) does not get run until the next iteration from the caller.

For now, I can solve this by pushing IOBuffer(read(fd)) instead of fd, but this reads the entire file / process output into memory which might defeat the purpose.
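A minimal sketch of that workaround, written against Julia 1.x rather than the 0.6 used above (that is an assumption, as is `cat` being on the PATH; `genbuffered` is a made-up name). Each source is copied fully into memory before it is handed over, so it carries exactly the memory cost just mentioned:

```julia
# Sketch for Julia 1.x; `genbuffered` is a hypothetical name.
genbuffered(file, pipe=false) = Channel() do c
    for i in 1:2
        # read(...) returns all bytes as a Vector{UInt8}; no stream is left open
        data = pipe ? read(`cat $file`) : read(file)
        put!(c, IOBuffer(data))   # the consumer gets an independent in-memory stream
    end
end

# Small demo file with the same content as out.txt above
path = tempname()
write(path, "Hello, pipe\n")

lens = Int[]
for pipe in (false, true), io in genbuffered(path, pipe)
    push!(lens, length(read(io)))
end
println(lens)
```

Since nothing the consumer holds can be closed by the producer, the order in which the two tasks run no longer matters.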

—david


#4

You are expecting the push! to the channel to block?
The default Channel size is 32 and a push! will only block once the channel is full.
Channel() do ... end will actually create an unbuffered Channel (csize=0), so push! will never block.

So since the push! doesn’t block, close will run immediately.


#5

OK, then, what is the ultimate Julia equivalent of a Python generator?

—david


#6

You are expecting the push! to the channel to block? Channel() do … end will actually create an unbuffered Channel (csize=0), so push! will never block.

The online help of Channel() says

Channel(0) constructs an unbuffered channel. put! blocks until a matching take! is called. And vice-versa.

which is what I actually want, but is the opposite of what you stated.
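To check the documented behaviour, here is a small sketch (assuming Julia 1.x; `events`, `producer` and `snapshot` are names made up for the illustration). It records what a producer manages to do on an unbuffered channel before any consumer calls take!:

```julia
events = String[]
c = Channel{Int}(0)            # capacity 0: unbuffered

producer = @async for i in 1:2
    push!(events, "before put $i")
    put!(c, i)                 # suspends here until a consumer takes the value
    push!(events, "after put $i")
end

yield()                        # give the producer a chance to run
snapshot = copy(events)        # what the producer did without any consumer

first_value = take!(c)
second_value = take!(c)
wait(producer)                 # let the producer finish its last step

println(snapshot)
println(events)
```

The snapshot shows that the producer stopped inside the first put!. What the sketch does not pin down is whether the code after put! runs before or after the consumer uses the value it took.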

—david


#7

You are right, I misread the documentation. The issue is slightly more complicated…

If you take a look at https://github.com/JuliaLang/julia/blob/217e059808a4e04900a48f9e2f03069a7038af32/base/channels.jl#L289 you can see that there is the potential for a race here:

Task1              Task2
  |                  |
push!                |
  |                  |
  |                take!
  |     <---      yieldto!
yield(v)  --->       |
  |                  |
close              read

There is no guaranteed order in which close and read are going to run in your code.
The implementation tries to hand over execution to Task2 preferentially, but if there is a yield point in Task2, Task1 will continue to execute. For example, with an explicit yield() in the consumer loop:

for pipe in [false, true]
    for fd in genpipe("out.txt", pipe)
        yield()          
        println("pipe ", pipe, " len ", length(read(fd)))
    end
end
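
One way to sidestep the ordering question is to never close in the producer and let the consumer close each stream once it has finished reading. The following is only a sketch under assumptions: it targets Julia 1.x, where open(cmd, "r") returns a single process object that can be read and closed like a stream (not the (stream, process) pair of 0.6), `cat` is assumed to be available, and `genstreams` is a made-up name.

```julia
# Sketch for Julia 1.x; `genstreams` is a hypothetical name.
genstreams(file, pipe=false) = Channel() do c
    for i in 1:2
        io = pipe ? open(`cat $file`, "r") : open(file, "r")
        put!(c, io)            # no close here: the consumer now owns the stream
    end
end

# Small demo file with the same content as out.txt above
path = tempname()
write(path, "Hello, pipe\n")

lens = Int[]
for pipe in (false, true), io in genstreams(path, pipe)
    try
        push!(lens, length(read(io)))
    finally
        close(io)              # closed only after the read has finished
    end
end
println(lens)
```

The trade-off is that a consumer which stops iterating early has to make sure nothing it was handed stays open.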


#8

Thanks, I see the problem now. It is, in a way, strange that the construct works so well for non-piped iobuffers.

I started writing the generator with a do block construct, hoping that the implicit close() would be triggered by destruction of the IO object, which would then happen when the caller iterates to the next item. However, that didn’t work either, so I guess my assumption about how the open() do construct works is false.
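
For reference, a small sketch of what the do-block form does (assuming Julia 1.x; `escaped` is a name made up for the illustration). The stream is closed when the block returns, independent of when the object is garbage-collected, so a stream that escapes the block is already closed:

```julia
# Small demo file with the same content as out.txt above
path = tempname()
write(path, "Hello, pipe\n")

escaped = open(path, "r") do io
    io                         # hand the stream out of the block
end

println(isopen(escaped))       # the block has returned, so the stream is closed
```

Inside a Channel task the block returns as soon as the task resumes after the push!, which is why the do-block variant behaves like the explicit close(fd).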

