Performance and allocation issue with arrays of functions (v1.5)

Sure, but @generated is easy and it performs well. Comparing the above compute_values2 with the generated version:
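
In case it helps to see the shape of it, a @generated version along these lines (the actual definition used for the timings may differ) unrolls the per-function writes with Base.Cartesian.@nexprs, using the same update rule as compute_values_map further down:

@generated function compute_values_generated(x₀, Δt, nΔt, f::Tuple{Vararg{<:Any,K}}) where {K}
    quote
        x = x₀
        f_vals = zeros($K, nΔt)
        for n in 1:nΔt
            x += 0.5 * Δt * x
            # paste K copies of the assignment at compile time, one per function
            Base.Cartesian.@nexprs $K k -> (f_vals[k, n] = f[k](x))
        end
        return f_vals
    end
end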

julia> @btime compute_values2($x₀, $Δt, $nΔt, $[f1])
  1.665 μs (1 allocation: 896 bytes)
 1×100 Matrix{Float64}:
 0.948985  0.999966  0.927798  0.64436  0.0897141  -0.623416  -0.998433  -0.317148  0.919731  …  0.265737  -0.944011  0.0275612  -0.999406  0.342549  0.939933  -0.676239  0.599191

julia> @btime compute_values2($x₀, $Δt, $nΔt, $[f2])
  231.457 ns (1 allocation: 896 bytes)
 1×100 Matrix{Float64}:
 1.5625  2.44141  3.8147  5.96046  9.31323  14.5519  22.7374  35.5271  55.5112  86.7362  135.525  …  1.65608e18  2.58763e18  4.04317e18  6.31746e18  9.87103e18  1.54235e19  2.40992e19

julia> @btime compute_values2($x₀, $Δt, $nΔt, $(f1,f2))
  1.726 μs (1 allocation: 1.77 KiB)
 2×100 Matrix{Float64}:
 0.948985  0.999966  0.927798  0.64436  0.0897141  -0.623416  -0.998433  -0.317148   0.919731  …  -0.944011    0.0275612   -0.999406    0.342549    0.939933    -0.676239    0.599191
 1.5625    2.44141   3.8147    5.96046  9.31323    14.5519    22.7374    35.5271    55.5112        1.65608e18  2.58763e18   4.04317e18  6.31746e18  9.87103e18   1.54235e19  2.40992e19

julia> @btime compute_values_generated($x₀, $Δt, $nΔt, $(f1,))
  1.506 μs (1 allocation: 896 bytes)
 1×100 Matrix{Float64}:
 0.948985  0.999966  0.927798  0.64436  0.0897141  -0.623416  -0.998433  -0.317148  0.919731  …  0.265737  -0.944011  0.0275612  -0.999406  0.342549  0.939933  -0.676239  0.599191

julia> @btime compute_values_generated($x₀, $Δt, $nΔt, $(f2,))
  220.222 ns (1 allocation: 896 bytes)
 1×100 Matrix{Float64}:
 1.5625  2.44141  3.8147  5.96046  9.31323  14.5519  22.7374  35.5271  55.5112  86.7362  135.525  …  1.65608e18  2.58763e18  4.04317e18  6.31746e18  9.87103e18  1.54235e19  2.40992e19

julia> @btime compute_values_generated($x₀, $Δt, $nΔt, $(f1,f2))
  1.550 μs (1 allocation: 1.77 KiB)
 2×100 Matrix{Float64}:
 0.948985  0.999966  0.927798  0.64436  0.0897141  -0.623416  -0.998433  -0.317148   0.919731  …  -0.944011    0.0275612   -0.999406    0.342549    0.939933    -0.676239    0.599191
 1.5625    2.44141   3.8147    5.96046  9.31323    14.5519    22.7374    35.5271    55.5112        1.65608e18  2.58763e18   4.04317e18  6.31746e18  9.87103e18   1.54235e19  2.40992e19

Although, if you’re unhappy with laziness as an excuse, then with a little more work we can use dispatch instead of Base.Cartesian.@nexprs to unroll our expressions:

# Apply every function in the tuple `fs` to `x`, returning a tuple of results.
# The recursion peels off one function at a time and terminates by dispatch on
# the single-element tuple, so the "loop" over functions unrolls at compile time.
@inline fmap(fs::Tuple, x) = (first(fs)(x), fmap(Base.tail(fs), x)...)
@inline fmap(fs::Tuple{T}, x) where {T} = (first(fs)(x),)

function compute_values_map(x₀, Δt, nΔt, f::Tuple{Vararg{<:Any,K}}) where {K}
    x = x₀
    f_vals = zeros(K, nΔt)
    for n in 1:nΔt
        x += 0.5 * Δt * x
        fv = fmap(f, x)          # evaluate all K functions at the current x
        for k in 1:K
            f_vals[k, n] = fv[k]
        end
    end
    return f_vals
end
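
As a quick illustration (hypothetical, not part of the original benchmarks), calling fmap directly shows the recursion building a fully typed tuple of results, which is what lets the inner k-loop stay cheap:

fmap((sin, x -> x^2), 0.5)    # == (sin(0.5), 0.25), inferred as Tuple{Float64, Float64}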

Result:

julia> @btime compute_values_map($x₀, $Δt, $nΔt, $(f1,f2))
  1.562 μs (1 allocation: 1.77 KiB)
 2×100 Matrix{Float64}:
 0.948985  0.999966  0.927798  0.64436  0.0897141  -0.623416  -0.998433  -0.317148   0.919731  …  -0.944011    0.0275612   -0.999406    0.342549    0.939933    -0.676239    0.599191
 1.5625    2.44141   3.8147    5.96046  9.31323    14.5519    22.7374    35.5271    55.5112        1.65608e18  2.58763e18   4.04317e18  6.31746e18  9.87103e18   1.54235e19  2.40992e19