Nested (non-syntactic?) loop fusion in 0.6

marius311 · March 8, 2017, 4:08pm

Is there a way for loop fusion in Julia to operate even recursively into function calls? By that I mean, consider,

julia> a=b=rand(2000,2000);

julia> @time a.+b.+a;
  0.064173 seconds (10.63 k allocations: 31.015 MiB, 2.54% gc time)

julia> myplus(a,b) = a.+b
myplus (generic function with 1 method)

julia> @time a.+myplus(a,b);
  0.058254 seconds (53 allocations: 61.037 MiB, 4.30% gc time)

Its clear the latter call wasn’t fused into a single loop since twice the memory was allocated.

My best guess is this isn’t supported, and maybe never will be because the ambiguity of what to do if myplus has other side-effects. However, is there any way to make something similar to this work, to allow shorthand for some longer operation than myplus? It certainly seems within the realm of reason that the compiler could figure out in simple cases like this that its safe to fuse.

pabloferz · March 8, 2017, 4:17pm

You can write a .+ myplus.(a, b)

StefanKarpinski · March 8, 2017, 4:29pm

In this case, @pabloferz’s suggestion will fuse, but it wouldn’t work if your user-defined function didn’t happen to also do the scalar operation you want – or more generally, if you have an operation that cannot be expressed as a vectorization of any scalar operation (e.g. matmul, matrix inversion). In such cases, we probably will never be able to guarantee fusion, although a sufficiently clever compiler optimization could observe that fusion is possible and do it for you.

At a higher level, the new syntactic loop fusion support means that one should shift programming styles away from what’s traditionally been done in vectorized languages:

In vectorized languages, one wants to push vectorization as far “up” as possible: compose more complex operations out of simpler vectorized operations, and then let vectorization flow from the top down to individual, optimized vectorized operations.
In Julia since 0.5, one wants to do the opposite and push scalar operations as far up as possible: express complex operations in entirely scalar form for as long as you can (i.e. until you hit an inherently array operation like a matmul or matrix inversion or whatever), and then vectorize at the very end by calling functions using f.(args...) syntax.

So in this case, if myplus can be defined to only operate on scalars, then do that – don’t define it to work on vectors at all. Instead, if you need to call it on vectors, use the myplus.(a, b) form.

marius311 · March 8, 2017, 5:45pm

Thanks yea, I realized after @pabloferz’s answer that I made my example here too simple, indeed what I’m actually trying to do wouldn’t work because it involves a matrix multiply.

In any case, useful advice on programming style, I will keep it in mind!

ChrisRackauckas · March 8, 2017, 6:20pm

Instead of loop fusion for matrix multiplication and the like, you can just chain inplace operations. That will still avoid any memory allocations.

Topic		Replies	Views
Element-wise vector multiplication and fusing dot New to Julia	12	16674	January 2, 2017
Vectorized math Performance	12	667	February 17, 2021
Higher-order lazy ranges/iterators Performance	4	808	August 24, 2018
Newbie question about loop fusion and broadcasting New to Julia	9	1009	June 12, 2020
Blog post: Loop fusion and vectorization in Julia 0.6 Internals & Design announcement , broadcast	28	8401	May 4, 2017

Nested (non-syntactic?) loop fusion in 0.6

Related topics