PyGen - Python-style generators

It is also worth pointing out that this syntax has been around in C# for a very long time. Not clear to me whether Python or C# introduced it, but in any case, that would also support a name that doesn’t have “Python” in it.

I actually really wish Julia had this built in, like Python and C#…

Oh, fair enough! Yeah, it seems especially narrow to call it Py-anything then, doesn’t it?

I have to agree about wishing this idiom was in julia natively.

function f(x)
...
   yield y
...
end

is much cleaner than the alternative:

function f(x)
    function temp(c)
    ...
        push!(c, y)
    ...
    end
    return Channel(c -> temp(c))
end

Though, that might just be my lack of creativity…

FYI, Channel(c -> temp(c)) can be simplified to just Channel(temp). Even better, with do-syntax:

function f(x)
    return Channel() do c
        ...
        push!(c, y)
        ...
    end
end

@cstjean nice!

And even better (in my opinion), combining it with short-form function syntax,

f(x) = Channel() do c
    ...
    push!(c, y)
    ...
end

Effectively a macro-free way of defining a python-like generator.
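
To make the pattern concrete, here is a minimal sketch with the `...` filled in by a made-up example. The name `fib` and the Fibonacci body are purely illustrative and not from anyone's post; the only point is that `push!(c, a)` sits where `yield a` would in Python.

```julia
# Hypothetical example: a generator for the first n Fibonacci numbers.
fib(n) = Channel() do c
    a, b = 0, 1
    for _ in 1:n
        push!(c, a)        # plays the role of `yield a`
        a, b = b, a + b
    end
end

# A Channel is iterable, so it can be consumed like any other iterator.
for x in fib(5)
    println(x)
end
```

The channel closes when the do-block returns, which is what ends the `for` loop on the consumer side.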

Yeah, I think I have to agree. I’ve always had reservations about the do-syntax because I find it reads a little weird, but this statement is really great. It’s very clear that you’re filling up a channel with values.

Ah, I might have misunderstood the Python use of yield. I would like the C# version of yield, which simply is a shortcut to implementing an iterator, without the overhead of Channel etc. That is what I would really like to see in julia.
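
For comparison, a hand-written iterator with no Task or Channel involved might look like the sketch below. It assumes the Julia 1.x `iterate` protocol, which postdates this thread (0.6 used start/next/done instead), and `Countdown` is a made-up type for illustration only.

```julia
# Hypothetical type: counts down from `start` to 1.
struct Countdown
    start::Int
end

# Julia 1.x iteration protocol: return (value, next_state),
# or `nothing` once the sequence is exhausted.
Base.iterate(c::Countdown, state = c.start) =
    state < 1 ? nothing : (state, state - 1)

# Optional, but lets `collect` preallocate a typed array.
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

collect(Countdown(3))   # → [3, 2, 1]
```

This is the boilerplate that a C#-style `yield` would generate automatically; here the state has to be threaded through by hand.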

You’re partially correct about the Python yield statement, from what I understand. It constructs a generator, which is a type of iterator, but I think that iterator is implemented with a type of coroutine, similar to the Julia Task. The @pygen macro implements the closest Julia equivalent I can find: a Task that communicates over a Channel in v0.6, or, before v0.6, a Task that communicates via produce() and consume().

Now, if there were a way to compile a function with yield statements into a nice high-performance iterator (sounds like C# does this?), that would be amazing… A brief play with these Channels shows they’re no competition for an iterator when it comes to performance.

No I don’t think it’s similar to Task.

You can see what they actually are here under the section “How Python Generators Work”. I couldn’t find much detail on Julia’s Task implementation, but looking at the source it smells pretty similar (a stack frame + a pointer to the instruction we’re currently at, etc.).

I am kind of curious about what C# is doing though. Just because something calls itself an “iterator” doesn’t preclude the same coroutine stuff from happening in the background, I suppose.

It seems that C# got away without a stack. They just expand the whole thing in the caller’s stack as an object carrying the state, plus a bunch of goto statements to hop from where the iterator is consumed and back again. Article here.
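
A rough idea of what such an expansion could look like if written by hand in Julia is sketched below. This is not what any compiler or macro actually emits: `Phases` is a hypothetical type, the explicit `phase` number stands in for the goto targets, and it assumes the Julia 1.x `iterate` protocol.

```julia
# Generator being emulated (pseudocode):
#     yield "start"
#     for i in 1:n
#         yield i
#     end
#     yield "done"

struct Phases
    n::Int
end

# The state tuple (phase, i) records which yield point to resume from
# and the current value of the loop variable.
function Base.iterate(p::Phases, state = (1, 0))
    phase, i = state
    if phase == 1                        # resume before the first yield
        return ("start", (2, 1))
    elseif phase == 2                    # resume inside the loop
        i <= p.n && return (i, (2, i + 1))
        return ("done", (3, 0))          # loop finished, emit the last value
    else                                 # phase 3: generator exhausted
        return nothing
    end
end

# Tell `collect` not to ask for a length up front.
Base.IteratorSize(::Type{Phases}) = Base.SizeUnknown()

collect(Phases(2))   # → Any["start", 1, 2, "done"]
```

No Task is created and no stack is switched; each call to `iterate` is an ordinary function call that jumps to the right branch based on the saved state.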

That’s exactly why they are very different. Python, being an interpreter, essentially makes every function a closure (they explicitly keep their state), except that they call them frames. Julia tasks switch native stacks, which is very different.

With the do syntax how would you specify the type and size of the Channel?

Specify the ctype and csize keyword arguments:

julia> f(x) = Channel(ctype=Int, csize=5) do c
           push!(c, ...)
       end
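
For readers on a newer Julia, the same idea can be spelled with a type parameter and a positional buffer size. This is a sketch that assumes Julia 1.3 or later, where `Channel{T}(func, size)` is available; `squares` is a made-up example function.

```julia
# Hypothetical example: an Int-typed channel with a buffer of 5 items.
squares(n) = Channel{Int}(5) do c
    for i in 1:n
        push!(c, i^2)
    end
end

collect(squares(4))   # → [1, 4, 9, 16]
```

Because the element type is part of the channel's type, `collect` returns a `Vector{Int}` rather than a `Vector{Any}`.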

And even better (in my opinion), combining it with short-form function syntax,

f(x) = Channel() do c
    ...
    push!(c, y)
    ...
end

Effectively a macro-free way of defining a python-like generator.

This seems substantially simpler than what I’ve been doing, which is following suggestions from this 2014 blog post:

function f(x)
    ...
    function _it()
        ...        
        produce(y)
    end
    Task(_it)
end

@nsmith - sounds like this is similar to what you’re doing in your macro? When I read the release notes, I was worried this would stop working - this is switching to channels now?

@kevbonham Yeah, in 0.6, produce and consume are deprecated in favour of Channels. Some new Channel constructors have been made to emulate the produce, consume pattern more easily, which is what @fengyang.wang is using above.

Before v0.6 the equivalent clever idiom would be

f(x) = Task() do
    ...
    produce(y)
    ....
end

Got it - thanks!

It’s a bit annoying from a learning standpoint that there’s nothing that clearly flags where the yield equivalent statement is. push! is pretty generic. So +1 for getting some dedicated syntax in Base.

Also @davidanthoff seemed to indicate there’s some overhead?

Yeah, there is a price to context switching back and forth between the producer Task and consumer Task. The C# implementation (https://blogs.msdn.microsoft.com/oldnewthing/20080812-00/?p=21273) expands the generator code in the caller’s frame in the form of a pile of goto statements, so there is no context-switching overhead. In 0.6 the new @label and @goto macros could be used to implement something similar, which would be pretty neat.

I did some measurements using the following code, and the results are quite impressive.

using PyGen

# `const` keeps these globals type-stable so they don't skew the timings
const N     = 10000000
const Ndisp = 1000000

println("for loop")

function for_loop()
    for i in 0:N
        if i % Ndisp == 0
            println(i)
        end
    end
end


println(@elapsed for_loop())

println()
println("pygen generator")

@pygen function pygen_generator()
    for i in 0:N
        if i % Ndisp == 0
            yield(i)
        end
    end
end

function using_a_pygen_generator()
    for i in pygen_generator()
        println(i)
    end
end

println(@elapsed using_a_pygen_generator())

It would be great to have something like that included in Base.

If this can’t be added to Base and it stays a standalone package, I think it should be renamed.
PyGen made me think that it was using Python!

Can someone post similar sample code using Channel here?
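
One possible Channel-based counterpart is sketched below, using the do-syntax idiom from earlier in the thread. The names `channel_generator` and `using_a_channel_generator` are made up to mirror the PyGen version, and the snippet is meant to run on its own in a fresh session; no timings are claimed here.

```julia
const N     = 10_000_000
const Ndisp = 1_000_000

# Channel-based counterpart of `pygen_generator`: push! replaces yield.
channel_generator() = Channel() do c
    for i in 0:N
        if i % Ndisp == 0
            push!(c, i)
        end
    end
end

function using_a_channel_generator()
    for i in channel_generator()
        println(i)
    end
end

println("channel generator")
println(@elapsed using_a_channel_generator())
```

Only 11 values actually cross the channel here, so this mostly times the producer loop rather than the Task switching; pushing every `i` would make the Channel overhead far more visible.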