Edited version of my initial post based upon the response from @Raf.
After I wrote this post, (1) I received advice on how to write a better post, which I have followed below, and (2) I found the answer to my question in Understanding the allocation behavior on arrays vs scalars - #2 by stevengj. I am therefore editing the post to include the answer, in case others have the same question.
I want to maximize performance when calling functions that are stored as array elements. My underlying motivation is that I plan to write income tax simulation models (tax calculators) for tax systems in ~41 states. The models will run on microdata files that represent taxpayers in those states. I want the models to be as parameter-driven as possible, with the functions that do the necessary tax calculations stored in arrays. The actual functions, the order in which they are processed, and their arguments will vary from state to state and from policy scenario (potential tax law) to policy scenario, defined in JSON files. It is a variant of, and inspired by, work being done here.
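To make the idea concrete, here is a rough, hypothetical sketch of what I mean by a parameter-driven pipeline. The rule names, the RULES table, and the helpers build_pipeline and apply_pipeline are invented purely for illustration; in practice the list of rule names would come from a JSON scenario file, and the real rules would be far more involved.
# Hypothetical sketch only: a scenario lists rule names, which are mapped to
# functions and applied in order. All names and rules here are made up.
const RULES = Dict(
    "standard_deduction" => income -> max(income - 12_000.0, 0.0),
    "flat_rate_tax"      => taxable -> 0.05 * taxable,
)

# Turn a list of rule names (e.g. read from a JSON scenario file) into a
# tuple of functions; a tuple records the concrete type of each function.
build_pipeline(names) = Tuple(RULES[name] for name in names)

# Apply each rule in order, threading the running value through.
function apply_pipeline(pipeline, income)
    x = income
    for f in pipeline
        x = f(x)
    end
    return x
end

pipeline = build_pipeline(["standard_deduction", "flat_rate_tax"])
apply_pipeline(pipeline, 50_000.0)   # tax owed under this made-up scenario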
I was concerned because when I tested the idea of putting functions into arrays, performance degraded substantially. I have since discovered from the post linked above that this problem can be solved by using StaticArrays.
The remainder of this post shows the problem, and the solution.
The problem
The example below first defines consts for two functions (sin, cos) separately, and then defines a const array containing the same two functions. I then call each of the two functions 10^6 times in two different ways: (1) through the individual consts, and (2) through the array elements. (I have constructed the example this way to isolate the issue, so that we can see, I believe, that the slowdown results solely from using an array of functions.)
The first approach has ~13.5k memory allocations on the first run (which includes compilation) and only 6 allocations on the second run (those 6 come from @time itself, I believe).
The second approach (array elements) has ~6 M allocations on every call (and takes far more time when scaled up), so it is much less efficient. Because this is an approach I would like to use, my initial question was: is there a way to make it more efficient?
const fvar1 = sin
const fvar2 = cos
const fvec = [sin, cos]   # the same two functions stored in an array

# Approach 1: call the functions through the individual const bindings.
function fun_vars(n::Int64)
    t = 0.0
    for i = 1:n
        t += fvar1(i)
        t += fvar2(i)
    end
    return t
end

# Approach 2: call the functions by indexing into the const array.
function fun_vec(n::Int64)
    t = 0.0
    for i = 1:n
        t += fvec[1](i)
        t += fvec[2](i)
    end
    return t
end
@time fun_vars(10^6) # 13.5k allocations
@time fun_vars(10^6) # 6 allocations on 2nd time
@time fun_vec(10^6) # 6 M allocations
@time fun_vec(10^6) # 6 M allocations on 2nd time
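Why the array version is slow, as I understand it from the linked post: the array literal [sin, cos] promotes its elements to the abstract type Function, so the compiler cannot tell at compile time which function fvec[1] returns; each call is then dispatched dynamically and its Float64 result is boxed, which is where the millions of allocations come from. One way to see this is a quick check like the following (not in my original post):
using InteractiveUtils      # for @code_warntype outside the REPL

eltype(fvec)                # Function (an abstract type)
@code_warntype fun_vec(10)  # the fvec[1](i) calls are inferred as Any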
The solution
It turns out there is a way to make this more efficient. One solution (perhaps the best one, I do not know) is to use StaticArrays, as shown below.
# Pkg.add("StaticArrays")
using StaticArrays

const fvec2 = SVector(sin, cos)

function fun_vec2(n::Int64)
    t = 0.0
    for i = 1:n
        t += fvec2[1](i)
        t += fvec2[2](i)
    end
    return t
end
@time fun_vec2(10^6) # 16.6k allocations
@time fun_vec2(10^6) # 6 allocations on 2nd time
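For completeness, a plain tuple should also work here without the StaticArrays dependency, since a tuple records the concrete type of each element; I have not benchmarked this variant as carefully as the versions above, but it is worth trying.
const ftup = (sin, cos)   # element types are part of the tuple's type

function fun_tup(n::Int64)
    t = 0.0
    for i = 1:n
        t += ftup[1](i)
        t += ftup[2](i)
    end
    return t
end

@time fun_tup(10^6)   # first call includes compilation
@time fun_tup(10^6)
For timing comparisons that exclude compilation and measurement noise, BenchmarkTools.@btime gives more reliable numbers than @time.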
I hope this is helpful to someone.