I find it’s fairly trivial to implement Python-style generator functions in Julia, though so far only for unidirectional stream production.
I use yieldto, but the docs say “Its use is discouraged.” How idiomatic is my approach? I’m fairly new to Julia and eagerly learning.
Surface syntax:
julia> @generator function g(n::Int)
           m = n * 3
           @yield m
           m += 7
           @yield m
           m -= 9
           @yield m
       end
g (generic function with 1 method)
julia> for i in g(5)
           println(i)
       end
15
22
13
julia> collect(g(5))
3-element Vector{Int64}:
 15
 22
 13
I think the style developed in FGenerators.jl combined with FLoops.jl is a more performant and well founded way to go about this, rather than spawning tasks and trying to properly manage the yielding. But it relies heavily on transducers, so that is a lot of infrastructure to digest and learn.
Here’s an example where I believe your use of yieldto fails:
julia> using .PyStyleButUnidirGenerators
julia> @generator function organpipe(n::Integer)
           i = 0
           while i != n
               i += 1
               @yield i
           end
           while true
               i -= 1
               i == 0 && return
               @yield i
           end
       end;
julia> collect(organpipe(2))
I left this running on my computer for about 5 minutes and it never completed. I assume there is some sort of task deadlock. I think if you want to do this with tasks, you’re better off making a channel and then put!ing data into that channel at each yield, rather than trying to manage the task switching yourself.
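As a rough sketch of the channel-based alternative suggested above (not taken from any of the packages in this thread), the same organpipe can be written with only Base's Channel; the name organpipe_channel is made up for illustration:

```julia
# Channel-backed version of organpipe: each `@yield i` becomes `put!(ch, i)`.
# The do-block runs in its own Task bound to the channel, so the channel is
# closed automatically when the block returns.
function organpipe_channel(n::Integer)
    Channel{Int}() do ch
        i = 0
        while i != n
            i += 1
            put!(ch, i)      # blocks until the consumer takes the value
        end
        while true
            i -= 1
            i == 0 && return # returning ends the task and closes the channel
            put!(ch, i)
        end
    end
end

collect(organpipe_channel(2))  # [1, 2, 1]
```

The scheduler handles the task switching here, at the cost of the channel's locking and queueing overhead.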
Here’s how FGenerators.jl performs with that organpipe:
julia> using FGenerators, BenchmarkTools  # BenchmarkTools provides @btime
julia> @fgenerator function organpipe(n::Integer)
           i = 0
           while i != n
               i += 1
               @yield i
           end
           while true
               i -= 1
               i == 0 && return
               @yield i
           end
       end;
julia> let n = Ref(100)
           @btime sum(organpipe($n[]))
           @btime collect(organpipe($n[]))
       end;
  13.867 ns (0 allocations: 0 bytes)
  1.456 μs (204 allocations: 13.80 KiB)
Nice to see so many approaches have been attempted (the See also section of GeneratorsX.jl lists many of them)! Also parallelism and solo performance are tackled well.
julia> using PyStyleButUnidirGenerators
julia> @generator function organpipe(n::Integer)
           i = 0
           while i != n
               i += 1
               @yield i
           end
           while true
               i -= 1
               i == 0 && return
               @yield i
           end
       end
organpipe (generic function with 1 method)
julia> collect(organpipe(2))
3-element Vector{Int64}:
 1
 2
 1
For cases where HPC (parallelism included) is not a concern, I still favor my cooperative-scheduling implementation over Channel-based ones (all of them are in the low-performance camp anyway), because Channel communication seems somewhat heavyweight compared to plain cooperative scheduling. As with Python generators, producing and consuming are interleaved and always synchronous, so with cooperative scheduling there is no cost for a mutex or memory barrier/fence, unless the procedures also perform async communication in the meantime (which can be opted into straightforwardly).
This somewhat aligns with C++'s philosophy of “pay only for what you use”.
I’m surprised none of those attempts addresses the backward communication facility of Python generators, i.e. the .send() method of a generator instance. Or did I miss something?
Though that’s of limited usefulness, and Julia’s yieldto() works almost the same way: just record your caller Task as the yield target (my implementation simply leverages that).
I feel Julia’s macro system could give even better ergonomics for such use cases, but I don’t have a particular one in mind at the moment.
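To make the "record your caller Task as a yield target" idea concrete, here is a minimal sketch of .send()-like two-way communication using only Base's Task, current_task and yieldto. It is not the code of my package; running_total and demo_send are names made up for this illustration:

```julia
# The "generator" body gets the consumer task and always switches back to it
# explicitly. The value passed by the consumer's next `yieldto` becomes `x`.
function running_total(consumer::Task)
    total = 0
    while true
        x = yieldto(consumer, total)  # hand out `total`, wait for a sent value
        x === nothing && break        # `nothing` is used here as a stop signal
        total += x
    end
    # Do not fall off the end: the consumer is not in the scheduler's queue,
    # so control must be handed back explicitly.
    yieldto(consumer, :done)
end

function demo_send()
    consumer = current_task()                  # the recorded yield target
    gen = Task(() -> running_total(consumer))
    r0 = yieldto(gen)           # starts the body; it hands back its first total
    r1 = yieldto(gen, 5)        # roughly Python's gen.send(5)
    r2 = yieldto(gen, 7)
    r3 = yieldto(gen, nothing)  # ask the body to finish
    return (r0, r1, r2, r3)
end
```

Under these assumptions demo_send() is expected to return (0, 5, 12, :done). The explicit hand-back at the end matters: a body that simply returns leaves the consumer waiting for a switch that never comes.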
Julia’s Task is very different from the “stackless” coroutines/generators of Python/C++/Rust/etc. In Julia, a genuine call stack is allocated when starting a Task. Furthermore, the optimizer does not reason about the code across multiple tasks. As such, a yieldto-based implementation of generators will (at least currently) have significant non-optimizable overheads.
Coroutines and generators are interesting programming devices and it’d be nice to see more uses in Julia. For example, they are useful for writing composable parsing tools. But I think there are not many uses of yieldto in the Julia ecosystem because of its overhead, and because people tend to be crazy about performance.
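For contrast, a "stackless" generator boils down to an explicit state machine that the compiler can see through. A hand-written sketch of organpipe on top of Base.iterate (the OrganPipe type is made up for illustration and is not what FGenerators.jl generates) might look like this:

```julia
# All generator state lives in a small immutable tuple instead of a Task stack.
struct OrganPipe
    n::Int
end

Base.IteratorSize(::Type{OrganPipe}) = Base.SizeUnknown()
Base.eltype(::Type{OrganPipe}) = Int

# State is (last value produced, whether we are still ascending).
function Base.iterate(op::OrganPipe, state=(0, true))
    i, ascending = state
    if ascending && i != op.n
        return (i + 1, (i + 1, true))   # ascending phase: 1, 2, ..., n
    end
    i -= 1
    # Note: unlike the loop version, this also stops right away for n <= 0.
    i <= 0 && return nothing
    return (i, (i, false))              # descending phase: n-1, ..., 1
end

collect(OrganPipe(2))  # [1, 2, 1]
```

No task is started and no stack is allocated here, so loops over OrganPipe can be inlined and optimized like any other iterator.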
I guess each Task has its own dedicated native stack, so it runs at native machine speed until the next yieldto or other yield point, which is a pretty lovable thing about Julia.
And with the majority of the high-performance parts well optimized, I’d suggest that by the time you need to yield, there is usually some not-quite-performance-friendly situation to handle, e.g. expanding the capacity of a buffer, or of series storage backed by a database. Some slowdown is relatively forgivable in such cases, as long as full machine speed can be resumed without friction once the situation has settled. I see Julia as superior in achieving that goal.
In handling infrequent situations, especially complex ones (w.r.t. business rules etc.), ergonomics may be more valuable than run speed, since code maintenance and other software-engineering effort can cost far more in business terms than a marginally shorter run time gives back.
I’ve been longing to code in Julia for years, but could only start recently. I love Julia as always, but to be honest it’s a bit of a pity that it feels somewhat like a DSL for the high-machine-performance programming domain. I’d expect esc, gensym, fieldcount and the like to live only in Meta and require explicit qualification to use, but they are exposed from Base. Personally I regard them as pollution of the business domain’s concept space when I deliver business functionality to my org’s analytics team as a Julia API. Analysts should learn Julia for sure, but bloated interfaces and tools with a performance-optimization focus are not welcoming to citizen developers, many of whom are non-programmers deeply involved in computer-powered business.
I’m experimenting with composable grammar (syntax+semantics+pragmatics) components based on Julia, and am delighted so far. Hopefully I can share something in the near future.
Yes, I later went over it indirectly through links in @Mason 's reply.
I’m aware it’s Channel-based, and seemingly it doesn’t support communication back into the generator from outside either. There are also the limitations listed in the Caveats section of its readme:
In a try block only top level @yield statements are allowed.
In a finally block a @yield statement is not allowed.
An anonymous function can not contain a @yield statement.
None of those exists in my cooperative-scheduling implementation, I suspect.
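The reason such caveats should not apply to task-backed generators is that the body runs on a real stack, so a yield point inside try or finally is nothing special. A small sketch using Base's Channel as a stand-in for my package (guarded is a made-up name):

```julia
# A producer that yields from inside a `try` block and cleans up in `finally`.
function guarded(n::Integer)
    log = String[]
    ch = Channel{Int}() do ch
        try
            for i in 1:n
                put!(ch, i)          # a yield point nested inside try/for
            end
        finally
            push!(log, "cleanup")    # runs before the task ends and the channel closes
        end
    end
    return ch, log
end

ch, log = guarded(3)
vals = collect(ch)   # [1, 2, 3]; by now `log` contains "cleanup"
```

Because the channel is only closed once the producing task has finished, the finally block has already run by the time collect returns.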
I’m pretty sure it’s not channel-based? The readme explicitly calls out channels as slower for this use case and even benchmarks against them. The same benchmarks also include a task-based implementation. WRT calling back into the generator, that’s supported via Manual · ResumableFunctions.
The two-way communication supported there is quite like Python’s .send() semantics too.
But with Julia’s extra macro-fu, I’d imagine something more ergonomic could get en-sugar-ed, like:
@generator function questionnaire()
    name = @yield "Who are you?"
    from = @yield "Where are you from?"
    age = @yield "How old are you?"
    return "Hello, $name from $(from)!\n" *
           "You were born in $(year(today())-age)."
end
Then:
@consume questionnaire() do q
    if q === "Who are you?"
        @feedback "John"
    elseif q === "Where are you from?"
        @feedback "Chicago"
    elseif q === "How old are you?"
        @feedback 21
    end
end
Which evaluates to "Hello, John from Chicago!\nYou were born in 2001."
This use case is too fictional to be useful in real world cases, but seems interesting and maybe someone (including future me) can find a really useful scenario.