I know from the docs that I should avoid containers with abstract element types; however, I am running into a couple of use cases where I have tag-calculation objects holding arbitrary functions.
Base.@kwdef struct TagCalculation{T<:Function}
    name :: String
    calc :: T
    tags :: Vector{String}
end
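For example (instance values made up), each distinct calc produces a distinct concrete type:

a = TagCalculation(name = "amp",    calc = sin,     tags = ["trig"])
b = TagCalculation(name = "double", calc = x -> 2x, tags = ["linear"])
typeof(a)   # TagCalculation{typeof(sin)}
typeof(b)   # TagCalculation{var"#..."}, the anonymous function's closure type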
However, I will have a vector of these and they will have different calculation types. I've noticed that the element type of a vector of functions gets widened. With definitions along these lines (assumed here; vsin and vf are reused in the benchmarks below):
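vsin = fill(sin, 100000)                 # Vector{typeof(sin)}: concrete eltype
vf   = convert(Vector{Function}, vsin)   # Vector{Function}: abstract eltype
# Mixing different functions, e.g. [sin, cos], gives a Vector{Function} automatically.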
So it automatically promotes to the type Function. Next, if I try evaluating this on a vector of random numbers:
x = randn(100000)

@time for ii in eachindex(vsin)
    vsin[ii](x[ii])
end
  0.012697 seconds (498.98 k allocations: 9.140 MiB)

@time for ii in eachindex(vf)
    vf[ii](x[ii])
end
  0.012592 seconds (498.98 k allocations: 9.140 MiB)
So performance is nearly identical in these cases. Is Function a special kind of abstract type that doesn't suffer a performance penalty relative to its concrete subtypes, or is the penalty of the abstraction hidden in the cost of something else (like computing sin)? If I could break my common calculations down into, say, 20 categories, would it be worth the effort of grouping them into a NamedTuple of vectors (one vector per calculation category), or would I be okay just lumping them all into a single vector and collecting their outputs (which are all Float64)?
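The grouping I have in mind is something like this sketch (all names hypothetical):

# One concretely typed vector per calculation category, so each
# loop body can dispatch statically instead of dynamically.
trig  = [TagCalculation(name = "t$i", calc = sin,     tags = String[]) for i in 1:10]
scale = [TagCalculation(name = "s$i", calc = x -> 2x, tags = String[]) for i in 1:10]
grouped = (trig = trig, scale = scale)

run_group(calcs, xs) = [c.calc(x) for (c, x) in zip(calcs, xs)]   # type-stable within a group
results = map(g -> run_group(g, randn(length(g))), grouped)       # one specialized loop per field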
julia> function test(v, x)
           y = similar(x)
           for ii in eachindex(v)
               y[ii] = v[ii](x[ii])
           end
           y
       end;
julia> using BenchmarkTools
julia> @btime test($vsin, $x);
  1.561 ms (2 allocations: 781.30 KiB)

julia> @btime test($vf, $x);
  6.122 ms (299491 allocations: 5.33 MiB)
Interestingly, if it's known to be a small union of functions, one may use a small Union element type, which the compiler can union-split.
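For instance (a sketch; vu is made up, test and x as above):

vu = Union{typeof(sin), typeof(cos)}[isodd(i) ? sin : cos for i in 1:100000]
@btime test($vu, $x);   # union splitting recovers most of the concrete-case speed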
Right, thanks for the benchmarking tips! Unfortunately, the union of functions is unknown and would be pretty big in most cases. In one application there might be about 20-50 calculation types; in another there might be more than a thousand.
That being said, the penalty only seems to be about a factor of 4 for a basic calculation. The calculations I plan on running are likely to be more involved and include dictionary lookups, so the real-world impact of putting different functions together like this is probably going to be smaller, with a huge benefit in simplification.
Oh neat! I didn't know about FunctionWrappers! I do know all the input and output type signatures (the input is always a single DataFrame and the output is always a vector of floats). I don't see much documentation for it, though. How would I actually use it to wrap a function?
Here's a simple example. I see you already know that a plain fun::Function field is abstractly typed and bad for performance, but just in case somebody else stumbles on this some day down the road:
using FunctionWrappers
import FunctionWrappers: FunctionWrapper

struct BadStruct
    fun::Function
    second_arg::Float64
end

struct GoodStruct
    fun::FunctionWrapper{Float64, Tuple{Float64, Float64}}
    second_arg::Float64
end

evaluate_strfun(str, arg) = str.fun(arg, str.second_arg)

bad_example = BadStruct(hypot, 1.0)
good_example = GoodStruct(hypot, 1.0)
# If you run these in a REPL that has colors enabled, you'll see more clearly.
@code_warntype evaluate_strfun(bad_example, 1.5)
#=
MethodInstance for evaluate_strfun(::BadStruct, ::Float64)
  from evaluate_strfun(str, arg) in Main at /home/cg/fwexample.jl:15
Arguments
  #self#::Core.Const(evaluate_strfun)
  str::BadStruct
  arg::Float64
Body::Any
1 ─ %1 = Base.getproperty(str, :fun)::Function
│   %2 = Base.getproperty(str, :second_arg)::Float64
│   %3 = (%1)(arg, %2)::Any
└──      return %3
=#
@code_warntype evaluate_strfun(good_example, 1.5)
#=
MethodInstance for evaluate_strfun(::GoodStruct, ::Float64)
  from evaluate_strfun(str, arg) in Main at /home/cg/fwexample.jl:15
Arguments
  #self#::Core.Const(evaluate_strfun)
  str::GoodStruct
  arg::Float64
Body::Float64
1 ─ %1 = Base.getproperty(str, :fun)::FunctionWrapper{Float64, Tuple{Float64, Float64}}
│   %2 = Base.getproperty(str, :second_arg)::Float64
│   %3 = (%1)(arg, %2)::Float64
└──      return %3
=#
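For your use case above (a DataFrame in, a Vector{Float64} out), something along these lines should work. This is an untested sketch assuming DataFrames.jl, and all the names here are made up:

using DataFrames
import FunctionWrappers: FunctionWrapper

# One concrete wrapper type shared by every calculation:
const CalcWrapper = FunctionWrapper{Vector{Float64}, Tuple{DataFrame}}

Base.@kwdef struct WrappedTagCalculation
    name :: String
    calc :: CalcWrapper   # concrete field; any conforming function converts on construction
    tags :: Vector{String}
end

tc = WrappedTagCalculation(name = "colsum",
                           calc = df -> Vector{Float64}(df.a .+ df.b),
                           tags = ["sum"])
tc.calc(DataFrame(a = [1.0, 2.0], b = [3.0, 4.0]))   # returns [4.0, 6.0]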