Performance of collections with Function as an abstract element type

I know from the docs that I should avoid containers with abstract element types; however, I am running into a couple of use cases where I have tag-calculation objects that hold arbitrary functions.

Base.@kwdef struct TagCalculation{T<:Function}
    name :: String
    calc :: T
    tags :: Vector{String}
end
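
For reference, a single instance looks like this (the values are just placeholders):

# The type parameter T is inferred from whatever function is passed in,
# so each instance on its own is concretely typed.
tc = TagCalculation(name = "sine tag", calc = sin, tags = ["trig"])
typeof(tc)   # TagCalculation{typeof(sin)}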

However, I will have a vector of these, and they will have different calculation types. I've noticed that a vector of only sin functions gives this output:

vsin = fill(sin, 100000)
100000-element Vector{typeof(sin)}:

If I change it so that one of them is cos instead, I get:

vf = [fill(sin, 99999);cos]
100000-element Vector{Function}:

So the element type is automatically widened to the abstract type Function. Next, if I time evaluating these on a vector of random numbers:

x = randn(100000)
@time for ii in eachindex(vsin)
       vsin[ii](x[ii])
       end
  0.012697 seconds (498.98 k allocations: 9.140 MiB)

@time for ii in eachindex(vf)
       vf[ii](x[ii])
       end
  0.012592 seconds (498.98 k allocations: 9.140 MiB)

So performance is nearly identical in these cases. Is Function a special kind of abstract type that doesn't suffer a performance penalty compared to its concrete subtypes, or is the penalty of abstraction just hidden in the cost of something else (like computing sin itself)? If I could break the common calculations down into, say, 20 categories, would it be worth the effort of grouping them into a NamedTuple of vectors (one concretely typed vector per calculation category, roughly as in the sketch below), or would I be okay just lumping them all into a single vector and collecting their outputs (which are all Float64)?
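
To be concrete about what I mean by the grouped option, something like this (just a rough sketch with made-up names, not code from my project):

# One concretely typed vector per category.
grouped = (sines = fill(sin, 50_000), cosines = fill(cos, 50_000))

# Each field has a concrete eltype, so this helper acts as a function barrier
# and compiles a specialized loop for each category.
eval_group(fs, xs) = [f(x) for (f, x) in zip(fs, xs)]

xg = randn(50_000)
results = map(fs -> eval_group(fs, xg), values(grouped))   # Tuple of Vector{Float64}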


Benchmarking in global scope is not recommended: the vectors there are non-constant globals, so every call is dynamically dispatched regardless of the vector's element type, and that overhead can obscure the differences you are looking for.

Benchmarking in a local scope shows:

julia> function test(v, x)
           y = similar(x)
           for ii in eachindex(v)
               y[ii] = v[ii](x[ii])
           end
           y
       end;

julia> using BenchmarkTools

julia> @btime test($vsin, $x);
  1.561 ms (2 allocations: 781.30 KiB)

julia> @btime test($vf, $x);
  6.122 ms (299491 allocations: 5.33 MiB)

Interestingly, if the element type is known to be a small union of function types, one may use:

julia> v2 = convert(Vector{Union{typeof(sin),typeof(cos)}}, vf);

julia> @btime test($v2, $x);
  1.621 ms (2 allocations: 781.30 KiB)

Perhaps there should be some heuristic to use a union as the eltype if there are only a few functions, instead of using the supertype.
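
For what it's worth, one can build that union manually from the functions that are actually present; something along these lines (tighten_eltype is just an illustrative name, not an existing function):

# Collapse the eltype to the union of the concrete function types that occur.
# Union splitting only pays off while that union stays small.
tighten_eltype(fs) = convert(Vector{Union{unique(typeof.(fs))...}}, fs)

v3 = tighten_eltype(vf);   # same data, but with a small Union eltype instead of Function

Since the union is computed at run time, the result's type isn't inferable at the call site, so this is best done once up front and then passed through a function barrier like test above.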


Right, thanks for the benchmarking tips! Unfortunately, the union of functions is unknown ahead of time and would be pretty big in most cases. In one application there might be about 20-50 calculation types; in another, more than a thousand.

That said, the penalty only seems to be about a factor of 4 for a basic calculation. The calculations I plan on running are likely to be more involved and include dictionary lookups, so the real-world impact of mixing different function types in one vector is probably going to be smaller, with a huge benefit in simplification.

You could also consider using FunctionWrappers.jl if you know the signature and input/output types of all the functions.


Oh neat, I didn't know about that! I do know all the input and output types (the input is always a single DataFrame and the output is always a vector of floats). I don't see much documentation for the package, though. How would I actually use it to wrap a function?

Here's a simple example. I see you already know that a plain fun::Function field is abstractly typed and bad for performance, but just in case somebody else stumbles on this someday down the road:


using FunctionWrappers
import FunctionWrappers: FunctionWrapper

struct BadStruct
  fun::Function    # abstract field type: the compiler can't infer the call's return type
  second_arg::Float64
end

struct GoodStruct
  fun::FunctionWrapper{Float64, Tuple{Float64, Float64}}    # concrete: two Float64s in, Float64 out
  second_arg::Float64
end

evaluate_strfun(str, arg) = str.fun(arg, str.second_arg)

bad_example  = BadStruct(hypot, 1.0)
good_example = GoodStruct(hypot, 1.0)    # hypot is converted to a FunctionWrapper on construction

# If you run these in a REPL that has colors enabled, you'll see the difference more clearly:
# the Any in the first output is highlighted in red.

@code_warntype evaluate_strfun(bad_example, 1.5)
#=
MethodInstance for evaluate_strfun(::BadStruct, ::Float64)
  from evaluate_strfun(str, arg) in Main at /home/cg/fwexample.jl:15
Arguments
  #self#::Core.Const(evaluate_strfun)
  str::BadStruct
  arg::Float64
Body::Any
1 ─ %1 = Base.getproperty(str, :fun)::Function
│   %2 = Base.getproperty(str, :second_arg)::Float64
│   %3 = (%1)(arg, %2)::Any
└──      return %3
=#

@code_warntype evaluate_strfun(good_example, 1.5)
#=
MethodInstance for evaluate_strfun(::GoodStruct, ::Float64)
  from evaluate_strfun(str, arg) in Main at /home/cg/fwexample.jl:15
Arguments
  #self#::Core.Const(evaluate_strfun)
  str::GoodStruct
  arg::Float64
Body::Float64
1 ─ %1 = Base.getproperty(str, :fun)::FunctionWrapper{Float64, Tuple{Float64, Float64}}
│   %2 = Base.getproperty(str, :second_arg)::Float64
│   %3 = (%1)(arg, %2)::Float64
└──      return %3
=#
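
To connect this back to the original TagCalculation: if every calc takes a single DataFrame and returns a Vector{Float64}, a wrapped version could look roughly like this (untested sketch; WrappedTagCalculation and mean_of_x are made-up names, and it assumes DataFrames.jl is loaded):

using DataFrames
import FunctionWrappers: FunctionWrapper

Base.@kwdef struct WrappedTagCalculation
    name :: String
    calc :: FunctionWrapper{Vector{Float64}, Tuple{DataFrame}}
    tags :: Vector{String}
end

# Any function with a compatible signature is converted on construction.
mean_of_x(df) = Float64[sum(df.x) / nrow(df)]

tc = WrappedTagCalculation(name = "mean of x", calc = mean_of_x, tags = ["demo"])
tc.calc(DataFrame(x = randn(10)))   # returns a Vector{Float64}

With this, a Vector{WrappedTagCalculation} stays concretely typed no matter how many distinct calculation functions it holds, and every call through calc has a known return type.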