Higher-order lazy ranges/iterators

Will Julia (under the hood) in the future fuse the loops of calls such as

map(x -> x + 1, map(x -> x * 2, [1, 2, 3]))

into a single loop, requiring only a single temporary element, regardless of the size of x? Similar to what D’s lazy ranges (std.algorithm.iteration.map) and Rust’s iterators does?

Further it would be cool if it could optimize away any temporary allocations at all in cases such as

reduce(x -> x + 1, map(x -> x * 2, [1, 2, 3]))

without the user being aware of it, as he or she has to be in D, Rust and similar where lazyness is explicit to the programmer.

Ahh, I see now that the Julia already has this optimization because of symbol _mapreduce occurring in the error message:

ERROR: MethodError: no method matching (::getfield(Main, Symbol("##41#43")))(::Int64, ::Int64)
Closest candidates are:
  #41(::Any) at REPL[10]:1
Stacktrace:
 [1] _mapreduce(::typeof(identity), ::getfield(Main, Symbol("##41#43")), ::IndexLinear, ::Array{Int64,1}) at ./reduce.jl:311
 [2] _mapreduce_dim(::Function, ::Function, ::NamedTuple{(),Tuple{}}, ::Array{Int64,1}, ::Colon) at ./reducedim.jl:305
 [3] #mapreduce#538 at ./reducedim.jl:301 [inlined]
 [4] mapreduce at ./reducedim.jl:301 [inlined]
 [5] #reduce#539 at ./reducedim.jl:345 [inlined]
 [6] reduce(::Function, ::Array{Int64,1}) at ./reducedim.jl:345
 [7] top-level scope at none:0

! Cool!

What kinds of mappings (computation graph node merges) other than

map-reduce to mapreduce

is the optimizer capable of performing?

We do implement loop fusion — but it’s a part of broadcasting. While it looks a little funny with anonymous functions, this does exactly what you’re after:

julia> (x -> x + 1).((x -> x * 2).([1, 2, 3]))
3-element Array{Int64,1}:
 3
 5
 7

You can also use some of the lazy structures folks have developed in packages, like MappedArrays.jl, to achieve a similar effect.

Note that reductions take two-argument functions as input as they pass both the “running” output and the next element in the collection to the function. E.g.,

julia> reduce((x,y)->x+y, map(x -> x * 2, [1, 2, 3]))
12

And as you’ve noted, this can be done more efficiently all at once with the mapreduce function

1 Like

It seems function chaining also fuses the loops, right? So, another way of writing it is:

julia> [1,2,3] .|> x->2x .|> x->(x+1)
3-element Array{Int64,1}:
 3
 5
 7
1 Like

Yes, that is effectively the same. It’s the broadcasting dots that fuse the loops, and there you’re broadcasting the function chain operator (.|> instead of just |>).

2 Likes

Amazing! the two solutions have identical run times:

julia> @btime  (x -> x + 1).((x -> x * 2).([1, 2, 3]))
  52.088 ns (2 allocations: 224 bytes)
3-element Array{Int64,1}:
 3
 5
 7

julia> @btime  [1,2,3] .|> x->2x .|> x->(x+1)
  52.088 ns (2 allocations: 224 bytes)
3-element Array{Int64,1}:
 3
 5
 7