Understanding generated functions

That’s great! The number of use cases for generated functions is indeed diminishing as Julia’s plain-old functions improve and we learn to compose things better. I think a great example here is multidimensional iteration… particularly if you’re interested (or will indulge me) in a walk through a bit of Julia’s evolution.

One limitation of Julia’s plain-old functions (especially in prehistoric times) is that it used to be difficult to express multidimensional iteration over an arbitrary N-dimensional array without leaning on linear indexing (which might be slow). There’s no built-in syntax for expressing an arbitrary number of for loops. You can manually say:

sum(v::Vector{Int}) = (r=0; for i in 1:size(v, 1); r += v[i]; end; r)
sum(A::Matrix{Int}) = (r=0; for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j]; …
sum(A::Array{Int,3}) = (r=0; for k in 1:size(A, 3); for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j,k]; …
sum(A::Array{Int,4}) = (r=0; for l in 1:size(A, 4); for k in 1:size(A, 3); for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j,k,l]; …
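
To make the pattern concrete, here’s the two-dimensional method from above written out in full (in a real session you’d probably pick a name other than sum to avoid shadowing Base.sum):

function sum(A::Matrix{Int})
    r = 0
    for j in 1:size(A, 2)        # outer loop over the second dimension
        for i in 1:size(A, 1)    # inner loop over the first (column-major order)
            r += A[i, j]
        end
    end
    return r
end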

But that’s awful and tedious and terrible to maintain. Julia has a really good set of metaprogramming tools, though, and so we could instead do something like:

for N in 1:4
    @eval sum(A::Array{Int,$N}) = (r=0; $(construct_n_for_loops(N, …)))
end
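
Here’s a minimal, self-contained sketch of that approach, with a hypothetical helper that builds the nested loops as an expression (construct_sum_loops and mysum are illustrative names, not the actual historical code):

function construct_sum_loops(N)
    # innermost statement: r += A[i_1, i_2, ..., i_N]
    body = :(r += A[$([Symbol(:i_, d) for d in 1:N]...)])
    # wrap one for loop per dimension, innermost over the first dimension
    for d in 1:N
        body = :(for $(Symbol(:i_, d)) in 1:size(A, $d); $body; end)
    end
    return body
end

for N in 1:4
    @eval function mysum(A::Array{Int,$N})
        r = 0
        $(construct_sum_loops(N))
        return r
    end
end

mysum(rand(Int, 2, 3, 4))  # dispatches to the generated 3-dimensional method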

That’s still really annoying to read and write, and it’s still hard to write that construct_n_for_loops AST-generation method — something that was commonly needed. So a very clever person came along and expressed all of that as a set of macros instead (yes, I’m intentionally linking to a really old doc):

@ngenerate N Int function sum{N}(A::Array{Int,N})
    r = 0
    @nloops N i A begin
        r += @nref N A i
    end
    r
end

That old 0.3 syntax said to generate those four methods for A::Array{Int, N}, each with N for loops indexing into A with N indices. The @nloops macro creates those loops, using variables i_1, i_2, etc., as indices structured to access all the dimensions of array A. It still exists — check out what it does for:

using Base.Cartesian
@macroexpand @nloops 4 i A begin
    r += @nref 4 A i
end
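
For reference, the expansion is roughly equivalent to the following nested loops (simplified; the actual @macroexpand output also declares the loop variables):

# roughly what @nloops 4 i A produces, with @nref 4 A i becoming A[i_1, i_2, i_3, i_4]
for i_4 in axes(A, 4)
    for i_3 in axes(A, 3)
        for i_2 in axes(A, 2)
            for i_1 in axes(A, 1)
                r += A[i_1, i_2, i_3, i_4]
            end
        end
    end
end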

Later on, Julia introduced generated functions. This meant that we no longer needed to generate just a fixed subset of these methods ahead of time — Julia could automatically generate whatever it needed on demand, based purely upon the types of the arguments. We reimplemented all these multidimensional structures as generated functions instead and deprecated @ngenerate (this is now 0.6 syntax):

@generated function sum(A::Array{Int})
    N = ndims(A)
    quote
        r = 0
        @nloops $N i A begin
            r += @nref $N A i
        end
        r
    end
end
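
One nice consequence: each new dimensionality you call it with triggers the generator once, on demand, and compiles a method body with exactly that many loops. Assuming the definitions above live in a fresh session (so this sum shadows Base.sum, and Base.Cartesian is already loaded from the earlier snippet):

sum(rand(Int, 8))        # first call with a Vector: generates and runs the 1-loop body
sum(rand(Int, 4, 4, 4))  # first call with a 3-d array: generates and runs a 3-loop body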

You’re probably now looking at me like a crazy person: why didn’t we just use a normal function with one loop, for i in eachindex(A)? Precisely. It took some time to realize that generated functions could be used to implement those fast cartesian iterators. It took even more time to realize that we didn’t need to use generated functions to implement those iterators — inlining alone was a powerful enough tool. This story is long enough, though; the power of inlining “lispy” recursive functions in this example can be another story for another day.
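
For the record, that plain-function version is just this (simplesum is an illustrative name, to keep it separate from the definitions above):

function simplesum(A::AbstractArray{Int})
    r = 0
    for i in eachindex(A)   # one generic loop, fast for any array type
        r += A[i]
    end
    return r
end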

This got long, but I hope it gives you a sense of how generated functions can be a form of very powerful “on-demand” code generation… but as Julia’s native codegen improves you don’t need to do this as often. I believe the most common use of generated functions now is just to do compile-time processing of type parameters to preserve type-stability.
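
As a tiny sketch of that last pattern (tuplesum is a hypothetical name): a generated function can inspect a tuple’s type parameters and unroll work across them, so the result stays inferable even for heterogeneous tuples:

@generated function tuplesum(t::Tuple)
    N = fieldcount(t)   # number of elements, known from the type alone
    N == 0 && return :(0)
    return Expr(:call, :+, (:(t[$i]) for i in 1:N)...)
end

tuplesum((1, 2.5, 0x03))  # 6.5, with the sum unrolled at compile time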
