Surprising runtime behaviour when wrapping functions

Running the following test program

using BenchmarkTools

f(x) = x

struct Wrapper
    f::Function
end

function evaluate(f, x)
    return f(x)
end
function evaluate(wrapper::Wrapper, x)
    return wrapper.f(x)
end

function test(f, n)
    [evaluate(f, x) for x in 1:n]
end

w = Wrapper(f)
global n
for n in [10, 100, 1000, 10000, 100000]
    test(f, n)
    @time test(f, n)

    test(w, n)
    @time test(w, n)
end
println()

@btime test(f, 10)
@btime test(w, 10)
@btime test(f, 100)
@btime test(w, 100)
@btime test(f, 1000)
@btime test(w, 1000)
@btime test(f, 10000)
@btime test(w, 10000)
@btime test(f, 100000)
@btime test(w, 100000)
println()

with Julia 1.6.1 prints

  0.000000 seconds (1 allocation: 160 bytes)
  0.000012 seconds (3 allocations: 224 bytes)
  0.000000 seconds (1 allocation: 896 bytes)
  0.000002 seconds (3 allocations: 960 bytes)
  0.000001 seconds (1 allocation: 7.938 KiB)
  0.000022 seconds (493 allocations: 15.656 KiB)
  0.000016 seconds (2 allocations: 78.203 KiB)
  0.000258 seconds (9.49 k allocations: 226.547 KiB)
  0.000218 seconds (2 allocations: 781.328 KiB)
  0.002767 seconds (99.49 k allocations: 2.281 MiB)

  27.968 ns (1 allocation: 160 bytes)
  444.949 ns (3 allocations: 224 bytes)
  56.809 ns (1 allocation: 896 bytes)
  2.180 μs (3 allocations: 960 bytes)
  482.105 ns (1 allocation: 7.94 KiB)
  21.300 μs (492 allocations: 15.64 KiB)
  5.140 μs (2 allocations: 78.20 KiB)
  208.100 μs (9493 allocations: 226.53 KiB)
  43.600 μs (2 allocations: 781.33 KiB)
  2.079 ms (99493 allocations: 2.28 MiB)

Is this expected behaviour? Is this a mistake on my side? Can we improve this somehow?

Yes, it’s expected. Function is an abstract type, so a field declared as f::Function can hold any function and the compiler cannot know which one. Calling wrapper.f(x) therefore goes through dynamic dispatch, as shown below:

julia> @code_warntype evaluate(f, 10)
Variables
  #self#::Core.Const(evaluate)
  f::Core.Const(f)
  x::Int64

Body::Int64
1 ─ %1 = (f)(x)::Int64
└──      return %1

julia> @code_warntype evaluate(Wrapper(f), 10)
Variables
  #self#::Core.Const(evaluate)
  wrapper::Wrapper
  x::Int64

Body::Any
1 ─ %1 = Base.getproperty(wrapper, :f)::Function
│   %2 = (%1)(x)::Any
└──      return %2

Note the Body::Any annotation in the second result: the return type of the call cannot be inferred.

You could define your Wrapper type as

struct Wrapper{F<:Function}
    f::F
end

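so that the field has a concrete type rather than an abstract one. Every Julia function has its own concrete type (typeof(f)), and the parameter F records it in the type of the wrapper.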
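As a rough check of the difference, the sketch below compares the two field declarations. The names AbstractWrapper, BoundedWrapper and call_wrapped are made up for this illustration and are not part of the original program.

```julia
# Hypothetical names for illustration only.
g(x) = x

struct AbstractWrapper
    f::Function          # abstract field type
end

struct BoundedWrapper{F<:Function}
    f::F                 # concrete for each instance
end

call_wrapped(w, x) = w.f(x)

aw = AbstractWrapper(g)
bw = BoundedWrapper(g)

# Is the declared type of the field concrete?
@show isconcretetype(fieldtype(typeof(aw), :f))
@show isconcretetype(fieldtype(typeof(bw), :f))

# What does inference conclude about the return type?
@show Base.return_types(call_wrapped, (typeof(aw), Int))
@show Base.return_types(call_wrapped, (typeof(bw), Int))
```

Both wrappers return the same values; only the amount of information available to the compiler differs.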

Thank you very much for your excellent answer. FYI, I was modelling my code on NLSolversBase.jl

mutable struct OnceDifferentiable{TF, TDF, TX} <: AbstractObjective
    f # objective
    df # (partial) derivative of objective
    fdf # objective and (partial) derivative of objective
    F::TF # cache for f output
    DF::TDF # cache for df output
    x_f::TX # x used to evaluate f (stored in F)
    x_df::TX # x used to evaluate df (stored in DF)
    f_calls::Vector{Int}
    df_calls::Vector{Int}
end

which doesn’t even annotate the function fields f, df and fdf.

Restricting field types to something concrete does not always improve performance; it depends on how the type is used. Over-parameterising types can also put strain on the compiler: every distinct function stored in .f produces a distinct wrapper type, and all code that takes the wrapper may have to be compiled again for it.
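To make the compilation cost concrete, here is a small sketch. SpecWrapper is a hypothetical name; the point is only that wrapping two different functions yields two different types.

```julia
# Hypothetical name for illustration only.
struct SpecWrapper{F}
    f::F
end

a = SpecWrapper(sin)
b = SpecWrapper(cos)

# sin and cos have different types, so the wrappers do too.
@show typeof(a)
@show typeof(b)
@show typeof(a) == typeof(b)

# Any method taking a SpecWrapper is specialised separately
# for SpecWrapper{typeof(sin)} and SpecWrapper{typeof(cos)}.
apply(w::SpecWrapper, x) = w.f(x)
apply(a, 0.0)
apply(b, 0.0)
```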

It may be that in OnceDifferentiable it doesn’t actually matter, or that an untyped field was more useful. Only the authors of that code can give a definite answer. When in doubt, it is usually best to benchmark these kinds of decisions.
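One way to run such a benchmark is sketched below, assuming BenchmarkTools is installed. Untyped, Typed, cheap and costly are hypothetical names chosen for this example. The idea is that the overhead of a dynamic call is fixed, so it may matter for a trivial function and be hard to see for one that does real work; the actual numbers depend on the machine and the Julia version.

```julia
using BenchmarkTools

# Hypothetical names for illustration only.
struct Untyped
    f                    # field of type Any
end

struct Typed{F}
    f::F
end

call(w, x) = w.f(x)

cheap(x) = x + 1            # almost no work per call
costly(x) = sum(sin, x)     # work grows with length(x)

xs = rand(10_000)

# Cheap function: dispatch overhead is a large share of the call.
@btime call($(Untyped(cheap)), 1)
@btime call($(Typed(cheap)), 1)

# Costly function: the same overhead is spread over much more work.
@btime call($(Untyped(costly)), $xs)
@btime call($(Typed(costly)), $xs)
```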


Thank you again for your explanations. This prompted me to check NLSolversBase. My findings are documented in issue #142 of JuliaNLSolvers/NLSolversBase.jl on GitHub, "Surprising runtime behaviour with objective types".

One last remark: it seems that making the struct parametric is already enough, even without the <:Function bound. Performance seems fine for

struct Wrapper{F}
    f::F
end
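A possible reason to prefer the unbounded parameter is that it also accepts callable objects that are not subtypes of Function. The sketch below uses the hypothetical names OpenWrapper, FunctionOnlyWrapper and Scale to show this.

```julia
# Hypothetical names for illustration only.
struct OpenWrapper{F}
    f::F
end

struct FunctionOnlyWrapper{F<:Function}
    f::F
end

# A callable struct (functor); it is not a subtype of Function.
struct Scale
    a::Float64
end
(s::Scale)(x) = s.a * x

w = OpenWrapper(Scale(2.0))
@show w.f(3.0)
@show isconcretetype(fieldtype(typeof(w), :f))

# FunctionOnlyWrapper(Scale(2.0)) would throw a MethodError,
# because Scale does not satisfy the F<:Function bound.
@show Scale <: Function
```

In both variants the field type is fixed by the type parameter, which is what lets the compiler resolve the call statically.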