What is a Base.Generator?

I’ve been looking at some sample code from the Flux model zoo, and I came across something like:

f(x) = x^2
d = (f(i) for i in 1:10)

(this is much simplified to help pose the question). The result indicates that d is Base.Generator{UnitRange{Int64},typeof(f)}(f, 1:10) - I cannot find any documentation of Generator at docs.julialang.org. It is clear that if I apply collect(d) I get the expected output array (1,4,9,…). However, I could just as easily get that result from using brackets instead of parentheses:

d = [f(i) for i in 1:10]

So, what is a Generator, and why would one utilize it rather than directly producing the Array?

Thanks in advance for tips!


The idea behind a generator is that it can produce its elements one at a time without allocating storage for all of them. You could, for example, write `for x in d` and you will iterate over the elements without having to allocate an array. This is very useful if the collection would be really large.
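A small sketch of that difference, reusing f from the question. The names gen, arr and sum_squares are illustrative only and not taken from the original code: the generator is consumed one element at a time, while the comprehension materializes the whole array first.

```julia
f(x) = x^2

# Lazy: nothing is computed until the generator is iterated.
gen = (f(i) for i in 1:10)

# Eager: allocates a 10-element array immediately.
arr = [f(i) for i in 1:10]

# Reductions such as sum accept any iterable, so no intermediate array is needed.
total_lazy = sum(gen)
total_eager = sum(arr)

# A for loop over a generator also consumes the elements one at a time.
function sum_squares(n)
    acc = 0
    for x in (f(i) for i in 1:n)
        acc += x
    end
    return acc
end
```

Both totals agree; the lazy version simply never builds the intermediate array, which is where the savings come from when the range is large.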


Thanks - that makes complete sense.

I still don’t know why I can’t find this on the docs site especially since it seems so useful


For some reason it doesn’t come up as one of the first few results when searching for “generator”, but it’s there:

https://docs.julialang.org/en/v1/manual/arrays/#Generator-Expressions-1


I can try again later - I typed in generator and then did a search in the resulting web page for Generator and never found it

If you’re new to these kinds of languages I highly recommend reading up on generators and iterators :). They make life beautiful.


Yea the point and advantage is clear given the explanation above.

However I just tried typing Generator into the docs.julialang.org search box again.

I read the entire list of items that shows up and there is nothing relevant. So I’d love to have been able to read about it as you suggest but the docs aren’t helping…


Those docs describe comprehensions. Do you know if there’s an API for creating generators outside of comprehensions?

The closest thing is probably the iteration interface, which I’ve used many times myself to do generator-like things. The actual Base.Generator object created by the comprehension syntax is considered an implementation detail, I believe.

If you really want to, you can construct a Base.Generator the same way you would any other type. The first argument is a function that’s called on each value of the second argument which is an iterable:

julia> x = Base.Generator(inv, [1,2,3])
Base.Generator{Array{Int64,1},typeof(inv)}(inv, [1, 2, 3])

julia> for i in x
         println(i)
       end
1.0
0.5
0.3333333333333333

This is not a well-documented topic, so the fact that you had a hard time finding it is not surprising.

Generator expressions are first-class objects, which allow you to decouple the generation from the collection in [ ... ] expressions.

The promise of the ( .... for ... [if ...]) syntax is that you get something that conforms to the iteration interface. Consider it a shorthand for implementing simple custom iterators using mapping and filtering, mostly in a performant way (except for a few edge cases). A lot of this functionality overlaps with Base.Iterators, predating that module.
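As a rough illustration of the mapping-and-filtering shorthand (the variable names below are made up for this sketch), a generator with an if clause behaves much like wrapping the source in Iterators.filter and mapping over the result:

```julia
# Generator with a filter clause: squares of the even numbers in 1:10.
evens_squared = (x^2 for x in 1:10 if iseven(x))

# Roughly the same pipeline spelled with Base.Iterators plus a plain generator.
same_thing = (x^2 for x in Iterators.filter(iseven, 1:10))

# Both are lazy; collect materializes them into an array.
result = collect(evens_squared)  # [4, 16, 36, 64, 100]
```

Either form can be passed to anything that accepts an iterable, such as sum, collect, or a for loop.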

Base.Generator is the particular implementation for this. It is not exported, and (technically) it may change in the future. It is not good practice to use it unless you are prepared to update your code in case that happens. Incidentally, it is used to implement collect and friends, but again that can change.


If use of Base.Generator is discouraged generally, what is the idiomatic iterator version of map?
Aka, for filter we have Iterators.filter, and for map, what do we have here?
(sure, writing (f(x) for x in D) works most times, even though it is ugly for long function names, but there is no do-block support.)
How to get a lazy map with do-block support?

Using generators is not discouraged at all, just keep in mind that they should be constructed using ( ... for ...), not using Base.Generator.

Alternatively, there is the excellent

https://github.com/JuliaCollections/IterTools.jl

package, which has imap.


Of course, that is what I meant. It’s just sad that this functional-oriented syntax does not support do-blocks, even though Base Julia usually tries to offer them everywhere.

I am not sure what the issue is. do blocks are just syntactic sugar for longer function bodies — if you insist, you can do

[begin
    z, w = do_complicated_stuff(x, y)
    A, B = many_lines_follow_this(z, w)
    K, K′ = this_is_where_you_lose_track(A, z, w, B)
 end for (x, y) in zip(xs, ys)]

It’s just not particularly good style, IMO. But there is no reason to be sad. :wink:

Sorry for sounding harsh, but what about that is do block syntax? The whole point is to decouple parameters and iteration variables whilst getting a way more readable code snippet. That failed here on the whole line.
OTOH it is not just about how to do it in general (sure, I could even ccall an assembler snippet inlined in C), but why the general rule of Base (that is, “functional patterns support do syntax”) was violated here. The order of arguments for Base.Generator is even built with the function argument in the first place. Why? I guess because recommended argument order suggests that. Why? To be able to use do-block syntax. So why not expose that?

I am not quite sure I understand what you are asking for in relation to comprehensions and generators. Specifically,

f(other_args...) do args...
    body
end

is just sugar for

f(args -> body, other_args...)

Since there is no function argument to comprehensions or generators, how would do fit into them? You can just use the body directly.
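To make the equivalence concrete, here is a minimal sketch using map, which does take a function as its first argument (xs, a and b are placeholder names):

```julia
xs = [1, 2, 3]

# do-block form: the block becomes the first argument of map.
a = map(xs) do x
    y = x + 1
    y * 2
end

# Equivalent explicit form with an anonymous function.
b = map(x -> (x + 1) * 2, xs)
```

Both calls produce the same array; the do-block only changes where the function body is written.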

Sorry, I am genuinely puzzled about what you are suggesting here. Can you give an example where the proposed syntax would help?

Workflow:
Write the code eagerly with filter and map. Later on you see that you need to save some space or only want partial evaluation → switch to an iterated variant.
For filter that just means prepending Iterators. to the call.
But for map this usually implies either importing a new package (IterTools.jl) or changing the structure of the code while losing some readability if the function is large.
The last function of the almost holy trinity of functional programming, reduce, doesn’t need a lazy variant, as that wouldn’t really make any sense. But the others in my opinion should have a streamlined and common pattern. Base.Generator doesn’t fit into it right now.
In my opinion, an Iterators.map(fun, args...) should be exported. It could be implemented roughly as Base.Generator(x -> fun(x...), zip(args...)). It’s just that missing such a feature from Base feels strange for a language that focuses a lot on functional programming and has nice syntax for interweaving anonymous functions into other code.
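A minimal sketch of what such a lazy map could look like, built only from the generator syntax rather than from Base.Generator directly. The name lazymap is hypothetical and not part of Base or any package:

```julia
# Hypothetical helper, not part of Base: a lazy map written as a generator expression.
lazymap(f, iters...) = (f(args...) for args in zip(iters...))

# Because the function is the first argument, do-block syntax works.
pair_sums = lazymap(1:3, 4:6) do x, y
    x + y
end

# Nothing is computed until the result is iterated or collected.
collected = collect(pair_sums)  # [5, 7, 9]

# Laziness means only the requested element is evaluated, even for a huge range.
first_square = first(lazymap(abs2, 1:10^12))
```

As far as I know, Julia 1.6 and later ship an Iterators.map that covers this use case, so on those versions the helper above should be unnecessary.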

So is it correct to say that you are looking for IterTools.imap, except that you would prefer it to be available as Iterators.map?

I am not sure that Base.Generator needs to fit into any kind of external API — currently, it is an implementation detail.

I don’t think that using a package for something is that much of a hassle, but YMMV. If anything, I would prefer Base.Iterators to be moved out and merged into IterTools.jl, for the usual reasons (consistent API, more frequent releases, availability of new features in LTS Julia releases, etc).


filter and map, as fairly primary functional features, should IMO support lazy/iterated evaluation in Base either consistently or not at all, so yes, there should be an exported lazy map in Base. On the other hand, no longer exporting Iterators, and thus effectively making it an internal module, would work for me too: it would hopefully remove that half-baked feeling, but it makes you decide. You want lazy collections? Go for IterTools or Transducers. You’re fine with eager evaluation? Stick to Base.
Currently it is: You want lazy collections? Stick to Base. Meh, you need IterTools. Well, it depends. You could go for Transducers as well, but then you won’t need Base.Iterators anymore.


There’s a related GitHub issue talking about a broadcast-like syntax for lazy map that might be of interest.

Personally, I think defining an Iterators.map is a no-brainer enhancement for discoverability regardless of the question of merging IterTools and Iterators. Why not open a PR implementing it? If you go the route of using Base.Generator, it’s only a single line of code.
