What do generated functions actually disallow?

The Manual states the following restrictions:

  1. Generated functions are only permitted to call functions that were defined before the definition of the generated function. (Failure to follow this may result in getting MethodErrors referring to functions from a future world-age.)
  2. Generated functions must not mutate or observe any non-constant global state (including, for example, IO, locks, non-local dictionaries, or using hasmethod). This means they can only read global constants, and cannot have any side effects. In other words, they must be completely pure. Due to an implementation limitation, this also means that they currently cannot define a closure or generator.

I can’t observe world age issues if I put a yet-to-be-defined function in the generated code:

julia> @generated foo(A) = :(bar(A))
foo (generic function with 1 method)

julia> bar(x) = x
bar (generic function with 1 method)

julia> foo([1,2])
2-element Vector{Int64}:
 1
 2

In practice, generated functions seem to be used to mutate input array arguments (e.g. in the RuntimeGeneratedFunctions.jl README), and I can’t trigger the purity error by mutating an input array or a global array. To trigger the error at compilation I need to write a nested function, and it doesn’t even need to involve any globals or mutation. Is this undefined behavior?

julia> @generated zerofirst1(A) = :(A[1] = zero(eltype(A)); A) # mutate input array
zerofirst1 (generic function with 1 method)

julia> zerofirst1([1,2])
2-element Vector{Int64}:
 0
 2

julia> B::Vector{Int} = [0]
1-element Vector{Int64}:
 0

julia> @generated zerofirst2(A) = :(A[1] = B[1]; A) # also access global array
zerofirst2 (generic function with 1 method)

julia> zerofirst2([1,2])
2-element Vector{Int64}:
 0
 2

julia> @generated writeB(A) = :(B[1] = A[1]; B) # mutate global array
writeB (generic function with 1 method)

julia> writeB([1,2])
1-element Vector{Int64}:
 1

julia> @generated getB(A) = :((x->B)(A)) # nested function accesses, not captures, global
getB (generic function with 1 method)

julia> getB([1,2])
ERROR: The function body AST defined by this @generated function is not pure. This likely means it contains a closure, a comprehension or a generator.
Stacktrace:
 [1] top-level scope
   @ REPL[29]:1

julia> @generated donothing(A) = :((x->x)(A)) # nested function does not capture
donothing (generic function with 1 method)

julia> donothing([1,2])
ERROR: The function body AST defined by this @generated function is not pure. This likely means it contains a closure, a comprehension or a generator.
Stacktrace:
 [1] top-level scope
   @ REPL[39]:1

Your bar call is part of the generated code; that’s OK. What’s not OK is for the body of the @generated definition itself to call functions that are defined later, like the bar call here:

julia> @generated foo(A) = bar(A)
foo (generic function with 1 method)

julia> bar(x) = :(println($x))
bar (generic function with 1 method)

julia> foo([1,2])
ERROR: MethodError: no method matching bar(::Type{Vector{Int64}})
The applicable method may be too new: running in world age 26741, while current world is 26742.

Incidentally, this also means that even if you fix this by defining bar first, Revising bar won’t work the way you might expect, for the same reason described in "Changes to macros are ignored" (Issue #20, timholy/Revise.jl): there’s no backedge from bar to the generated function.
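For comparison, here is a minimal sketch of the working order, assuming hypothetical names bar2 and foo2 that are not part of the original example. The helper is defined before the @generated definition, so the generator body can call it; inside that body, A is bound to the argument's type rather than its value.

```julia
# Hypothetical helper: defined BEFORE the generated function that calls it.
# It receives the argument's type and returns the expression to compile.
bar2(T) = :(($T, length(A)))

# The generator body calls bar2 at code-generation time; here A is a type.
@generated foo2(A) = bar2(A)

# The returned expression runs with A bound to the actual value.
result = foo2([1, 2])
```

Redefining bar2 afterwards is not guaranteed to change what foo2 does for argument types that have already been compiled, which is the missing-backedge point above.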

This is similar: it’s fine to generate code that mutates, but not fine to have the generator itself mutate globals and such. Violating this restriction is probably UB, in the sense that the Julia devs don’t want to specify how many times the generator will be called (leaving room for compiler improvements that change this in the future).
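To make the distinction concrete, here is a small sketch of the thing the Manual forbids, using hypothetical names GEN_CALLS and counted. The side effect sits in the generator body, not in the returned expression, so it runs whenever the compiler decides to invoke the generator rather than once per call.

```julia
# Hypothetical global counter, used only to illustrate the problem.
const GEN_CALLS = Ref(0)

@generated function counted(x)
    # Side effect in the generator itself: this is what the Manual disallows.
    GEN_CALLS[] += 1
    # The returned expression is the code that runs on every call.
    return :(x)
end

counted(1)
counted(2)      # same argument type as the previous call
counted(1.0)    # a different argument type

# How many times the generator ran is deliberately left unspecified.
ncalls = GEN_CALLS[]
```

The value of ncalls is not something to rely on; it can differ between Julia versions and compilation modes, which is the reason for treating this as undefined behavior.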

It makes sense that side effects are heavily restricted in the compile-time phase that constructs the returned expression. Is it accurate to summarize that the only restriction inside the returned expression is no nested functions, including closures and comprehensions, and every other restriction is for the @generated method body that is executed prior to returning that expression?

I think so, but I wouldn’t be surprised to hit other issues, e.g. "Allow return type declaration for generated function" (Issue #21322, JuliaLang/julia).
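As for the nested-function restriction, one workaround is to hoist the inner function to top level and have the generated code call it by name. This is only a sketch under that assumption; identity_helper, double_helper, donothing2 and doubleall are hypothetical names.

```julia
# Hypothetical top-level helpers, standing in for the anonymous functions.
identity_helper(x) = x
double_helper(x) = 2x

# Instead of :((x -> x)(A)), call the named helper in the generated code.
@generated donothing2(A) = :(identity_helper(A))

# Instead of a comprehension such as :([2x for x in A]), map a named function.
@generated doubleall(A) = :(map(double_helper, A))

same = donothing2([1, 2])
doubled = doubleall([1, 2])
```

Neither returned expression contains a closure, comprehension or generator, so the "not pure" error from the earlier examples should not be triggered.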

I’m satisfied enough to close this topic for now: the Manual is poorly worded but otherwise correct, and this reading is consistent with how generated functions are used in practice. More comments or a nearly duplicate topic are welcome.