I am trying to define simple anonymous “transformation” functions (something like “x → x/100”) in a YAML file (in a dictionary keyed by climatic variables) and then apply the transformation to a Raster file. I use this to align different data sources to the same unit of measure.
The following works when I prototype it:
tr_fs_h  = clim_settings["transformations_h"]  # from the yaml file
v        = "temperatures"
tr_f_str = isnothing(tr_fs_h) ? "identity" : get(tr_fs_h, v, "identity")
tr_f = eval(Meta.parse(tr_f_str))
orig_raster .= tr_f.(orig_raster) # apply the transformation
However, when I put the code in the actual function I got the following error:
ERROR: MethodError: no method matching (::GenFSM.Res_fr.var"#20#21")(::Int32)
The function `#20` exists, but no method is defined for this combination of argument types.
Closest candidates are:
(::GenFSM.Res_fr.var"#20#21")(::Any) (method too new to be called from this world context.)
In hindsight, it’s clear that the function is compiled before it has the information about which exact “transformation function” to apply.
How do you handle these cases?
Alternatively, you can use a function barrier, I think. So just create the anonymous function like you do currently and then pass it as an argument to another function that performs the transformation. This will cause a dynamic dispatch + recompilation but is likely a bit faster (except for trivial numbers of calls to the anonymous function).
Not in this case, I think (but please correct me if I’m wrong, anyone). These kinds of world-age issues can only be resolved by “hitting the top level” at some point, so that the function using the newly parsed f can actually be compiled with the knowledge of what f is/does, to my understanding. A function barrier would only shift the problem one function call lower, as in the following example.
In a sense you want the “opposite” of a function barrier: control flow should exit the call stack up to the topmost level, where it’s clear which methods should be used where, and then descend again into calling something with the newly evaluated piece of code.
julia> function foo(functionString)
           x = 1:10
           f = eval(Meta.parse(functionString))
           # `map` already acts as a function barrier here, in some sense
           return map(f, x)
       end
julia> foo("x -> x / 100")
ERROR: MethodError: no method matching (::var"#13#14")(::Int64)
Closest candidates are:
(::var"#13#14")(::Any) (method too new to be called from this world context.)
@ Main none:1
But running it twice works (if it’s a named rather than an anonymous function): after foo runs once, myfunction is known everywhere, so running foo again works, but it would of course use the definition from the previous run… so this isn’t really a solution.
julia> foo("myfunction(x) = x/100")
ERROR: MethodError: no method matching myfunction(::Int64)
Closest candidates are:
myfunction(::Any) (method too new to be called from this world context.)
@ Main none:1
julia> foo("myfunction(x) = x")
10-element Vector{Float64}:
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
The compiler tries to compile foo, but the method for f only gets created at foo’s runtime. As far as I understand it, this is no fundamental issue [EDIT: if you give up on compiling foo ahead of time, see below] (that’s why invokelatest works and does the right thing here), but it is some kind of performance and/or correctness issue (?).
So the only options I’m aware of are restructuring the code so that the relevant functions are loaded/evaluated first and used only after hitting the top level once, or, as @GunnarFarneback suggested, using invokelatest and living with some performance hit.
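For the original YAML use case, a minimal sketch of the invokelatest route could look like the following (the names tr_f_str and orig_raster are simply reused from the opening post, and the string is assumed to hold an expression such as “x -> x/100” or “identity”):

tr_f = eval(Meta.parse(tr_f_str))   # defines the method in a newer world age
# run the broadcast in the latest world age instead of the caller's fixed one
orig_raster .= Base.invokelatest(broadcast, tr_f, orig_raster)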
It’s fundamental to how a method is compiled before its execution, as sylvaticus mentioned. If the method eval-uates another method, the first method’s compilation has no idea about the second method because it hasn’t executed the eval call yet. The compiler can’t infer or optimize a call to a method it doesn’t know, so methods use fixed world ages to allow compiler optimizations, at the cost of needing invokelatest to force calls to use the most recent world age in the very rare, unoptimizably dynamic cases where eval increments world ages. abraemer’s dynamic dispatch idea is an attempt to emulate invokelatest without the trouble of manually putting it wherever the calls are, but dynamic dispatch isn’t a language-level option for this and evidently still uses obsolete world ages.
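As a small illustration of the world-age bookkeeping (Base.get_world_counter() is an internal but real counter, so take this as a sketch rather than an API guarantee), evaluating a new method definition bumps the global world age:

julia> w0 = Base.get_world_counter();

julia> eval(:(newfn(x) = x + 1));   # defining a method increments the world age

julia> Base.get_world_counter() > w0
true

A method that is already running stays pinned to the world age it was called in, which is exactly why it cannot see newfn (or the eval-ed transformation function) until control returns to the top level or an invokelatest is used.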
Can RuntimeGeneratedFunctions.jl help dodge eval’s world-age limitation here?
julia> using RuntimeGeneratedFunctions
julia> RuntimeGeneratedFunctions.init(@__MODULE__)
julia> function callonce!(anonfunc::Expr)
           @RuntimeGeneratedFunction(anonfunc)()
       end
callonce! (generic function with 1 method)
julia> callonce!(:(() -> 1))
1
julia> callonce!(:(() -> cos(pi)))
-1.0
It works by inserting the function body into a preexisting generated function instead of defining another function or method. That comes with a whole host of limitations, and the output of the call still can’t be inferred in the scope where the callable was instantiated, but it sounds fitting for emulating anonymous functions with single methods, and it does have optimized performance in dynamically dispatched higher-order function calls like abraemer intended.
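Applied to the original YAML transformation case, a hedged sketch (tr_f_str and orig_raster are assumed names from the opening post, and the string is assumed to contain an anonymous function such as “x -> x/100” rather than a bare name like “identity”) might look like:

using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)   # once, at the top level of the module

tr_f = @RuntimeGeneratedFunction(Meta.parse(tr_f_str))
orig_raster .= tr_f.(orig_raster)             # plain broadcast, no invokelatest needed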
It’s also worth pointing out that what I’ve written above only applies while we’re strictly stuck inside the method call that eval-uated or generated another method to be called, which might be unavoidable. But whenever we make it back to the global scope, we can start again at an updated world age. We don’t even have to complete the first method call; eval can take us there, though I don’t see much of a difference from invokelatest besides the extra work of interpolating values from the method’s scope into yet another expression.
julia> using RuntimeGeneratedFunctions; RuntimeGeneratedFunctions.init(@__MODULE__)
julia> function makeandcall(func::Expr, func2::Expr, func3::Expr)
           f = eval(func)
           input = [1 2 3]
           output1 = @invokelatest broadcast(f, input) # call anticipates world age issue
           f2 = @RuntimeGeneratedFunction(func2) # callable evades world age issue
           output2 = broadcast(f2, input)
           f3 = eval(func3)
           output3 = @eval let; broadcast($f3, $input) end # updated world age at global scope
           output1, output2, output3
       end
makeandcall (generic function with 1 method)
julia> makeandcall(:(x -> x^2), :(x -> x/10), :(x -> x%2))
([1 4 9], [0.1 0.2 0.3], [1 0 1])
Despite the outputs being uninferrable in any case, that’s fine if we can put the real work elsewhere via function barriers or another eval.
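As a concrete (hypothetical) sketch of that: the RuntimeGeneratedFunction itself is not inferrable where it is created, but handing it to a helper function puts the hot loop behind a function barrier, so the loop gets compiled for the concrete callable type (work and process are made-up names, and RuntimeGeneratedFunctions is assumed to be init-ed in the module as above):

work(f, data) = sum(f, data)   # compiled once per concrete type of `f`

function process(funcexpr::Expr, data)
    f = @RuntimeGeneratedFunction(funcexpr)   # uninferrable here
    return work(f, data)   # one dynamic dispatch, then a fast compiled loop
end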
Using eval and invokelatest or a RuntimeGeneratedFunction here is super inefficient if you’re only going to evaluate the function once.
Rather than wasting a bunch of time compiling a function at runtime only to throw it away, I think a better solution would be to use an interpreter to evaluate the code in the string.
Unfortunately, JuliaInterpreter.jl does not seem to have any nice off-the-shelf way to do this without wasting a bunch of time creating MethodInstances and whatnot, but it’s pretty easy to hack together a solution with it that beats eval + invokelatest by an order of magnitude, and is nearly two orders of magnitude faster than a RuntimeGeneratedFunction:
using JuliaInterpreter, ExprTools, RuntimeGeneratedFunctions
using BenchmarkTools   # for the @btime calls below
RuntimeGeneratedFunctions.init(@__MODULE__)

function apply_eval(s::String, x)
    f = eval(Meta.parse(s))
    @invokelatest f(x)
end

function apply_rgf(s::String, x)
    f = @RuntimeGeneratedFunction(Meta.parse(s))
    f(x)
end

function apply_interp(s::String, x)
    ex = Meta.parse(s)
    d = splitdef(ex)
    fr = Frame(@__MODULE__(), :())
    ex = quote
        $(only(d[:args])) = $x
        $(d[:body])
    end
    JuliaInterpreter.eval_code(fr, ex)
end
and then
julia> let s = "x -> x/100"
           x = 1.0
           @btime apply_eval($s, $x)
           @btime apply_rgf($s, $x)
           @btime apply_interp($s, $x)
       end
2.654 ms (1057 allocations: 49.41 KiB)
10.209 μs (151 allocations: 7.66 KiB)
236.699 μs (316 allocations: 13.38 KiB)
0.01
EDIT:
As was pointed out, I misread the 10.209 μs as 10.209 ms. However, this performance advantage for RuntimeGeneratedFunctions wasn’t real, since RuntimeGeneratedFunctions ‘remembers’ the expression it was passed, causing it to cheat the benchmark.
Update: yes it seems that the difference was that RuntimeGeneratedFunctions.jl was caching the expressions it evaled, making it seem faster than it really was in a @btime loop.
Here’s a macrobenchmark that avoids the caching:
function bench(f, x)
    ops = ["/", "*", "+", "-"]
    vals = rand(5)
    for op1 ∈ ops, op2 ∈ ops
        for y ∈ vals, z ∈ vals
            s = "(x) -> x $op1 $y $op2 $z"
            f(s, x)
        end
    end
end
init indeed makes a module-wide cache for method bodies that are inserted into a module-wide generated method, and my guess is that for most of your benchmark, the string is being processed to the same RuntimeGeneratedFunction instance that calls an already compiled generated method.
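A hedged way to see the caching directly (timings omitted since they vary; the point is only that the second callable maps to an already compiled generated method):

ex = Meta.parse("x -> x / 100")
f1 = @RuntimeGeneratedFunction(ex)
@time f1(1.0)   # first call compiles the generated method for this cached body
f2 = @RuntimeGeneratedFunction(ex)
@time f2(1.0)   # same expression, same cached body: no new compilation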