Understanding generated functions


#1

Hi all. I’m trying to understand how generated functions work. From the 0.6 manual:

A very special macro is @generated, which allows you to define so-called generated functions. These have the capability to generate specialized code depending on the types of their arguments with more flexibility and/or less code than what can be achieved with multiple dispatch. While macros work with expressions at parsing-time and cannot access the types of their inputs, a generated function gets expanded at a time when the types of the arguments are known, but the function is not yet compiled.

So my simplistic mind has the following paths, in the order of preference:

  1. Use regular functions (and multiple dispatch) for most cases

  2. Create macros to reduce boilerplate code that otherwise tricky/ugly to implement with regular functions

Where does generated functions fit? The manual says generated function has the additional benefit of knowing the type of the arguments, so it seems to imply that it’s addressing a gap that macros cannot help. Hence, I would add this as a last resort in my toolkit:

  1. Use generated functions when even macros cannot help with my situation.

I have a hard time coming up with a good use case, however. Does anyone have any concrete example where it’s tough/ugly to use regular functions and macros and it would be much nicer (or more performant) to code as generated functions?

Thanks


#2

It mostly comes in handy when you have a very large (or infinite) number of parametric types for which you want to have really performant code. For instance, StaticArrays.StaticVector{N, T} is a vector of length N. To sum all the N elements, you could write

function sum(vec::StaticVector)
   res = 0
   for i in 1:length(vec)
      res += vec[i]
   end
   res
end

but looping actually has a (very small) cost. This code would be more efficient:

sum(vec::StaticVector{1}) = vec[1]
sum(vec::StaticVector{2}) = vec[1] + vec[2]
sum(vec::StaticVector{3}) = vec[1] + vec[2] + vec[3]
...

It’s painful to write that code manually. A macro would help, but since N is potentially infinite, you can’t cover all cases. You should use a generated function, so that Julia generates the above code automatically, but only for those types that are actually used in your program.

In my research, I use formal logic, with types like Add{Mul{Int, Variable}, Int}. Generated functions allow me to do a big part of unification (i.e. symbolic manipulations) at compile-time instead of runtime.


#3

Note that for cases like this LLVM will happily unroll the loop for you since the loop length is statically known. With (almost) exactly your function:

julia> @code_llvm sum(rand(SVector{3}))

define double @julia_sum_62699(%SArray* nocapture readonly dereferenceable(24)) #0 !dbg !5 {
pass.2:
  %1 = getelementptr %SArray, %SArray* %0, i64 0, i32 0, i64 0
  %2 = load double, double* %1, align 8
  %3 = fadd double %2, 0.000000e+00
  %4 = getelementptr %SArray, %SArray* %0, i64 0, i32 0, i64 1
  %5 = load double, double* %4, align 8
  %6 = fadd double %3, %5
  %7 = getelementptr %SArray, %SArray* %0, i64 0, i32 0, i64 2
  %8 = load double, double* %7, align 8
  %9 = fadd double %6, %8
  ret double %9
}

#4

That’s great! The number of use-cases for generated functions are indeed diminishing as Julia’s plain-old functions improve and we learn to compose things better. I think a great example here is multidimensional iteration… particularly if you’re interested (or will indulge me) in a walk through a bit of Julia’s evolution.

One limitation of Julia’s plain-old-functions (especially in pre-historic times) is that used to be difficult to express multidimensional iteration over an arbitrary N-dimensional array without leaning on linear indexing (which might be slow). There’s no builtin syntax for expressing an arbitrary number of for loops. You can manually say:

sum(v::Vector{Int}) = (r=0; for i in 1:size(v, 1); r += v[i]; end; r)
sum(A::Matrix{Int}) = (r=0; for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j]; …
sum(A::Array{Int,3}) = (r=0; for k in 1:size(A, 3); for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j,k]; …
sum(A::Array{Int,4}) = (r=0; for l in 1:size(A, 4); for k in 1:size(A, 3); for j in 1:size(A, 2); for i in 1:size(A, 1); r += A[i,j,k,l]; …

But that’s awful and tedius and terrible to maintain. Julia has a really good set of metaprogramming tools, though, and so we could instead do something like:

for N in 1:4
    @eval sum(A::Array{Int,$N}) = (r=0; $(construct_n_for_loops(N, …)))
end

That’s still really annoying to read and write, and it’s still hard to write that construct_n_for_loops AST generation method — which was commonly needed. So a very clever person came along and expressed that all as a clever set of macros instead (yes, I’m intentionally linking to a really old doc):

@ngenerate N Int function sum{N}(A::Array{Int,N})
    r = 0
    @nloops N i A begin
        r += @nref N A i
    end
    r
end

That old 0.3 syntax said to generate those four methods for A::Array{Int, N} with N for loops and indexed into A with N indices. The @nloops macro creates those loops, using variables i_1, i_2, etc, as indices structured to access all the dimensions of array A. It still exists — check out what it does for:

using Base.Cartesian
@macroexpand @nloops 4 i A begin
    r += @nref 4 A i
end

Later on, Julia introduced generated functions. This meant that we no longer needed to only generate a subset of these methods — Julia could automatically generate whatever it needed on demand, based purely upon the types of the arguments. We reimplemented all these multidimensional structures as generated functions instead and deprecated @ngenerate (this is now 0.6 syntax):

@generated function sum(A::Array{Int})
    N = ndims(A)
    quote
        r = 0
        @nloops $N i A begin
            r += @nref $N A i
        end
    end
    r
end

You’re probably now looking at me like a crazy person, since, why didn’t we just use a normal function with just one loop for i in eachindex(A)? Precisely. It took some time to realize that generated functions could be used to implement those fast cartesian iterators. It took even more time to realize that we didn’t need to use generated functions to implement those iterators — inlining alone was a powerful enough tool. This story is long enough, though, the power of inlining “lispy” recursive functions in this example can be another story for another day.

This got long, but I hope it gives you a sense of how generated functions can be a form of very powerful “on-demand” code generation… but as Julia’s native codegen improves you don’t need to do this as often. I believe the most common use of generated functions now is just to do compile-time processing of type parameters to preserve type-stability.


#5

Writing linear algebra routines for static arrays is something that I still feel I need to use generated functions for. The reason being that the resulting tuple has to be created “in one shot”. Something like https://github.com/JuliaLang/julia/pull/21912 would probably help with that.