PyGen - python style generators

package
announcement

#21

That’s exactly why they are very different. Since Python is an interpreter, essentially every function is a closure (it explicitly keeps its state), except that Python calls them frames. Julia tasks switch native stacks, which is very different.
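To make the closure point concrete on the Julia side: a closure captures its locals and keeps them alive across calls, which is the same kind of state-keeping that Python frames provide for every function. A minimal sketch (`make_counter` is an illustrative name, not something from this thread):

```julia
# A closure captures `n` and keeps it alive between calls,
# much like a Python frame keeps a function's locals alive.
function make_counter()
    n = 0
    () -> (n += 1)   # each call increments and returns the captured state
end

counter = make_counter()
counter()   # 1
counter()   # 2
```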


#22

With the do syntax how would you specify the type and size of the Channel?


#23

Specify the ctype and csize keyword arguments:

julia> f(x) = Channel(ctype=Int, csize=5) do c
           push!(c, ...)
       end

#24

And even better (in my opinion), combining it with short-form function syntax,

f(x) = Channel() do c
    ...
    push!(c, y)
    ...
end

Effectively a macro-free way of defining a python-like generator.
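As a complete, self-contained version of this pattern, here is a hypothetical Fibonacci generator (`fib` and its locals are illustrative names; this uses the current `Channel{T}(size) do c` constructor, whereas 0.6 spelled the type and size with the `ctype`/`csize` keywords shown earlier):

```julia
# `push!` plays the role of Python's `yield`; iterating the Channel
# resumes the producer task each time a value is consumed.
fib(n) = Channel{Int}(0) do c
    a, b = 0, 1
    for _ in 1:n
        push!(c, a)
        a, b = b, a + b
    end
end

collect(fib(8))   # [0, 1, 1, 2, 3, 5, 8, 13]
```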

This seems substantially simpler than what I’ve been doing, which follows the suggestions from this 2014 blog post:

function f(x)
    ...
    function _it()
        ...        
        produce(y)
    end
    Task(_it)
end

@nsmith - sounds like this is similar to what you’re doing in your macro? When I read the release notes, I was worried this would stop working - this is switching to channels now?


#25

@kevbonham Yeah, in 0.6, produce and consume are deprecated in favour of Channels. Some new Channel constructors have been made to emulate the produce, consume pattern more easily, which is what @fengyang.wang is using above.

Before v0.6 the equivalent clever idiom would be

f(x) = Task() do
    ...
    produce(y)
    ...
end

#26

Got it - thanks!

It’s a bit annoying from a learning standpoint that there’s nothing that clearly flags where the yield equivalent statement is. push! is pretty generic. So +1 for getting some dedicated syntax in Base.

Also @davidanthoff seemed to indicate there’s some overhead?


#27

Yeah, there is a price to context switching back and forth between the producer Task and consumer Task. The [C# implementation](https://blogs.msdn.microsoft.com/oldnewthing/20080812-00/?p=21273) expands the generator code in the caller's frame as a pile of goto statements, so there is no context-switching overhead. In 0.6 the new @label and @goto macros could be used to implement something similar, which would be pretty neat.
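As a rough sketch of that idea, one can hand-write the state machine such an expansion would produce: the generator's locals move into a mutable struct, and each call jumps back to just after the last "yield". Everything here (`CountdownGen`, the labels) is hypothetical, not PyGen's or C#'s actual output:

```julia
# Hand-rolled resumable generator: `state` records which resume point
# to jump to, `i` is the generator's only local variable. Each call is
# a plain function call, with no task switch at all.
mutable struct CountdownGen
    state::Int
    i::Int
end
CountdownGen(n::Int) = CountdownGen(0, n)

function (g::CountdownGen)()
    g.state == 1 && @goto resume1
    @label top
    g.i > 0 || @goto done
    g.state = 1
    return g.i           # "yield" the current value
    @label resume1       # execution re-enters here on the next call
    g.i -= 1
    @goto top
    @label done
    g.state = -1         # exhausted
    return nothing
end

g = CountdownGen(3)
g(), g(), g(), g()   # (3, 2, 1, nothing)
```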


#28

I did some measurements using the following code and that’s quite impressive.

using PyGen

N     = 10000000
Ndisp = 1000000

println("for loop")

function for_loop()
    for i in 0:N
        if i % Ndisp == 0
            println(i)
        end
    end
end


println(@elapsed for_loop())

println()
println("pygen generator")

@pygen function pygen_generator()
    for i in 0:N
        if i % Ndisp == 0
            yield(i)
        end
    end
end

function using_a_pygen_generator()
    for i in pygen_generator()
        println(i)
    end
end

println(@elapsed using_a_pygen_generator())

It would be great to have something like that included in Base.

If this can’t be added to Base and it remains a standalone package, I think it should be renamed.
The name PyGen made me think it was using Python!

Can someone post here a similar sample code with Channel?


#29

FYI, a loop over 10000000 iterations takes no time compared to printing 10 numbers and the dynamic dispatch caused by non-constant global variables.


#30

I previously tested Task produce/consume with older Julia versions and it was much slower (even for basic tasks like this).

I’m experimenting with Channel for the first time.

channel_generator() = Channel() do c
    for i in 0:N
        if i % Ndisp == 0
            push!(c, i)
        end
    end
end

println()
println("channel generator")
function using_a_channel_generator()
    for i in channel_generator()
        println(i)
    end
end

println(@elapsed using_a_channel_generator())

In your opinion, what kind of measurements should be done to ensure that it doesn’t slow things down too much?


#31

As a start, make N and Ndisp constant globals.


#32

Also, FYI, these are not new in 0.6. They exist in 0.3.


#33

That’s very impressive!
Why is there so much speed difference?


#34

With

using PyGen

const N     = 10000000
const Ndisp = 1000000

println("for loop")

function for_loop()
    for i in 0:N
        if i % Ndisp == 0
            println(i)
        end
    end
end


println(@elapsed for_loop())

# ===

println()
println("pygen generator")

@pygen function pygen_generator()
    for i in 0:N
        if i % Ndisp == 0
            yield(i)
        end
    end
end

function using_a_pygen_generator()
    for i in pygen_generator()
        println(i)
    end
end

println(@elapsed using_a_pygen_generator())

# ===

channel_generator() = Channel() do c
    for i in 0:N
        if i % Ndisp == 0
            push!(c, i)
        end
    end
end

println()
println("channel generator")
function using_a_channel_generator()
    for i in channel_generator()
        println(i)
    end
end

println(@elapsed using_a_channel_generator())

it seems that PyGen is nearly 4x slower!

for loop
0
1000000
2000000
3000000
4000000
5000000
6000000
7000000
8000000
9000000
10000000
0.075400854

pygen generator
0
1000000
2000000
3000000
4000000
5000000
6000000
7000000
8000000
9000000
10000000
0.204231019

channel generator
0
1000000
2000000
3000000
4000000
5000000
6000000
7000000
8000000
9000000
10000000
0.056583126

#35

Hi

The latest version of SimJulia has an implementation of C#-style generators, i.e. a function yielding values is transformed into a finite state machine.

@resumable function fib()
    a = 0
    b = 1
    while true
        @yield return a
        a, b = b, a+b
    end
end

fib_gen = fib()

for i in 1:10
    println(fib_gen())
end

This approach is a lot faster than produce/consume or the newer Channels when the output cannot be buffered (as it is when using channels).
If there is some interest, I can take this out of SimJulia and make it an independent package.


#36

It’s way more than 4x slower. The print is still way slower than anything else for the normal loop approach.

Yes, that’s the correct way to implement this.


#37

That’s excellent @BenLauwens! I think there is interest in an independent package for this. I was approached a few times about publishing PyGen, but I felt the C#-style approach was the right solution and didn’t have the time to implement it. There is also the naming issue (@resumable is a good name, by the way).

@FemtoTrader I’m not sure why your Channel implementation is faster than PyGen. Maybe there was a type instability or something in my implementation.


#38

I think the Channel implementation is faster because push!(c, i) buffers the results and the print loop reads from the buffer. This is a lot faster because no Task switching is done. However, as in SimJulia, the yielding of values and their consumption must be synchronised, so no buffering can be allowed. Doing the same benchmark with a Channel(0) object gives very different results.
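That distinction can be written out directly (modern `Channel{T}(size)` syntax; on 0.6 it is the `csize` keyword). Both versions produce the same values; only the synchronisation behaviour differs:

```julia
# size = 0: unbuffered. The producer blocks on every push! until the
# consumer takes the value, so each value costs a task switch.
unbuffered() = Channel{Int}(0) do c
    for i in 1:5
        push!(c, i)
    end
end

# size = 64: buffered. The producer can run ahead of the consumer,
# amortising task switches over many values.
buffered() = Channel{Int}(64) do c
    for i in 1:5
        push!(c, i)
    end
end

collect(unbuffered())   # [1, 2, 3, 4, 5]
collect(buffered())     # [1, 2, 3, 4, 5]
```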


#39

@BenLauwens I’m trying with SimJulia

using SimJulia

println()
println("simjulia generator")

@resumable function simjulia_gen()
    for i in 0:N
        if i % Ndisp == 0
            @yield return i
        end
    end
end

function using_a_simjulia_generator()
    simjulia_generator = simjulia_gen()
    for i in simjulia_generator()
        println(i)
    end
end

println(@elapsed using_a_simjulia_generator())

but it only displays

simjulia generator
0
0.026708957

Any idea what is going wrong?


#40

@FemtoTrader I have not yet implemented the iterator interface.

import Base.done, Base.next, Base.start

using SimJulia

start(fsm::T) where T<:FiniteStateMachine = fsm._state

next(fsm::T, state::UInt8) where T<:FiniteStateMachine = fsm(), fsm._state

done(fsm::T, state::UInt8) where T<:FiniteStateMachine = fsm._state == 0xff

@resumable function simjulia_gen()
    i = 0
    while true
        if i % Ndisp == 0
            if i + Ndisp < N
                @yield return i
            else
                return i
            end 
        end
        i += 1
    end
end

N = 100
Ndisp = 3

function using_a_simjulia_generator()
    for i in simjulia_gen()
        println(i)
    end
end

println(@elapsed using_a_simjulia_generator())

@yield inside a for loop is not yet possible… the for loop is rewritten with an internal variable #temp during the lowering process, which I can’t capture in the macro. This is one of the reasons that C#-style generators should be implemented in core Julia. They are a straightforward extension of closures…
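To see where that hidden variable comes from, here is what a for loop desugars to, written out by hand (`manual_for` is an illustrative name; shown with the current `iterate` protocol, whereas 0.6 lowered loops through start/next/done, printing the hidden state as `#temp`):

```julia
# Hand-written equivalent of `for i in itr`: the iteration state is an
# extra variable the surface syntax never names, which is why a macro
# that only sees the `for` loop cannot capture it.
function manual_for(itr)
    acc = Int[]
    next = iterate(itr)            # (element, hidden_state), or nothing
    while next !== nothing
        (i, state) = next
        push!(acc, i)              # the loop body
        next = iterate(itr, state)
    end
    acc
end

manual_for(0:2:10)   # [0, 2, 4, 6, 8, 10]
```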

To compare with the other generators, it is better not to print the results: println takes more time than the task switching or the function calls.

Another possibility is the use of LLVM coroutines. I have no idea whether someone has already tried to use them in Julia.