# Eliminating allocations with functions in struct

Hello all,

I thought that the following would not result in dynamic dispatch, but I am still seeing allocation:

``````module DispTests

struct ParamFun{N,F}
fₙ::F
pₙ::NTuple{N,Vector{Float64}}
end

ParamFun{N}(f,p) where N = ParamFun{N,typeof(f)}(f,p)

function run_that_fun(f::ParamFun{N,F},x) where {N,F}
r = 0.0
@inbounds for i = 1:N
r += (f.fₙ[i])(x,f.pₙ[i])
end
r
end

function example()
f1(x,p) = abs(x)^p[1] + p[2]
f2(x,p) = cos(p[1]*x) + sin(p[2]*x)
f3(x,p) = x*(p[1]+p[2]+p[3])
return ParamFun{3}((f1,f2,f3),([1.0,0.3],[0.2,1],[1,2,3.0]))
end
end
``````
``````using BenchmarkTools
f = DispTests.example()
@benchmark DispTests.run_that_fun(\$f,2.0)
BenchmarkTools.Trial:
memory estimate:  144 bytes
allocs estimate:  9
--------------
minimum time:     242.228 ns (0.00% GC)
median time:      258.244 ns (0.00% GC)
mean time:        269.538 ns (1.58% GC)
maximum time:     2.345 μs (79.14% GC)
--------------
samples:          10000
evals/sample:     438
``````

Can I do this without using `FunctionWrappers`? If not, how would one go about using function wrappers for this, I can’t figure it out…

Thanks,
Jeremy

1 Like

The issue is in

``````    @inbounds for i = 1:N
r += (f.fₙ[i])(x,f.pₙ[i])
end
``````

`f.fₙ[i]` and `f.pₙ[i]` will have different types in each iteration, making the code type-unstable, as witnessed by

``````@code_warntype DispTests.run_that_fun(f, 2.0)
``````

If you’re trying to avoid all allocations, the first thing to check for should always be type stability.

One way to avoid the allocations is by using tuple recursion:

``````import Base: tail

paramfunsum(f::Tuple{}, x, p::Tuple{}) = 0 # base case
paramfunsum(f::Tuple, x, p::Tuple) = f[1](x, p[1]) + paramfunsum(tail(f), x, tail(p))

function run_that_fun(f::ParamFun{N,F},x) where {N,F}
paramfunsum(f.fₙ, x, f.pₙ)
end
``````

which results in

``````BenchmarkTools.Trial:
memory estimate:  0 bytes
allocs estimate:  0
--------------
minimum time:     42.578 ns (0.00% GC)
median time:      44.061 ns (0.00% GC)
mean time:        46.218 ns (0.00% GC)
maximum time:     163.997 ns (0.00% GC)
--------------
samples:          10000
evals/sample:     990
``````

Note that this relies on `paramsumfun` being inlined; in some cases you may have to force that using `@inline`. There is also a limit on how long the tuples can be before the compiler will give up and start allocating. If you’re running into that limit, you could use `@generated` functions instead, but the type of recursion above is typically preferred if possible.

5 Likes