Optimal way to broadcast over a container of funcs

Hi there,

I have a vector (more generally: a list) of functions the map a vector to a vector and also a matrix where each row is to be populated by each of the functions. See below an example:

function test()
    f_vec = [x -> x.^2 .+ i for i in 1:10]
    grid = collect(1:12)
    cont = zeros(length(f_vec), length(grid))
    
    for f_ind in 1:length(f_vec)
        cont[f_ind, :] .= f_vec[f_ind](grid)
    end


end

I get the following benchmarking results:

Which not seem ideal. Is there an optimal way of doing so?

Thanks a lot in advance!

You will not get a very meaningful benchmarking result if you access non-const globals inside your function. I would start by making cont and grid arguments to your test function.

1 Like

Yep, I just modified the post to reflect this. Thanks.

Try either of these (the latter using map rather than allocate-and-fill)

function test2()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1:12)
    cont = zeros(length(f_vec), length(grid))
    
    for f_ind in 1:length(f_vec)
        cont[f_ind, :] .= f_vec[f_ind].(grid)
    end
    return cont
end

function test3()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1:12)

    cont = map(((f,g),) -> f(g), Iterators.product(f_vec, grid))
    return cont
end

Notably, I changed your functions to not operate on vectors. Instead, they operate on scalars and I broadcast (or map) the function itself over the vector.

I’ll also discourage you from using collect in most situations. grid = 1:12 would have worked fine and would save memory and time.

I’ll remark that this is performant because your functions are all related closures. If you instead did a big list of dissimilar functions (e.g., f_vec = [sin, cos, tan, abs, exp] you’d have dynamic dispatch to contend with and a different pattern would be needed.

Keep in mind that you are now also benchmarking the time it takes to build the vector of functions and the vector of function inputs. If you want to time only the application, you should create those outside the test function and pass them as inputs.

BTW, there isn’t really a ‘list’ container that is more general than Vector. The thing that you would call a ‘list’ in e.g. Python, is just a Vector{Any} in Julia.

This looks short and easy to read:

function test4()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1.0:1.0:12.0)
    [f(g) for f in f_vec, g in grid]
end

BTW no need to collect the range in grid.

1 Like

Agreed, I just wanted to refer to the fact that it need to be a vector. Just a container or list of some form like a tuple. But thanks for the comment.

Thanks a lot for the suggestions. However, I had the function operating on vectors on purpose since I found that the banchmarking results where better. See the example below:

function test()

    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]

    function hermite5_cdf_temp(a::AbstractArray{TV, 1}, μ::TV, σ::TV) where TV
        func(x) = σ * Polynomial(a)((x - μ) / σ) * pdf.(Normal(μ, σ), x) / 24.0 + cdf(Normal(μ, σ), x)
        return func
    end

    hermite5_cdf_temp(a, 1.0, 1.0).(u)

end

Which benchmarks to:

and then compare to this, which is the same function but operating on vectors:

function test()

    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]

    function hermite5_cdf_temp(a::AbstractArray{TV, 1}, μ::TV, σ::TV) where TV
        func(x) = σ * Polynomial(a).((x .- μ) / σ) .* pdf.(Normal(μ, σ), x) / 24.0 .+ cdf.(Normal(μ, σ), x)
        return func
    end

    hermite5_cdf_temp(a, 1.0, 1.0)(u)

end

and benchmarks to:

Therefore I would like to keep it acting on vectors, thanks! Any ideas given this?

In particular, an array/vector of functions is an abstractly typed container, and really inhibits Julia’s type inference. You might be better off with careful use of a tuple of functions (or using fancy tricks like FunctionWrappers.jl), if you don’t want to re-think your whole approach.

1 Like

In Julia the call operation can be overloaded for types, this allows a nice method to get the functionality in the original post:

using Polynomials, Distributions

struct Hermite5
    a :: Vector{Float64}
    σ :: Float64
    μ :: Float64
end

(h::Hermite5)(x) = h.σ * Polynomial(h.a).((x .- h.μ) / h.σ) .* 
  pdf.(Normal(h.μ, h.σ), x) / 24.0 .+ cdf.(Normal(h.μ, h.σ), x)

function test2()
    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]
    h = Hermite5(a,1.0,1.0)

    h.(u)
end

This method might not only look better, but also perform better because of problems with captured variables in closures when handling functions.

1 Like

It did not perform better but thanks for the comment!

The trick would be to optimize the function application by removing repeated generation/allocation of structures:

struct Hermite6
    a :: Polynomial{Float64, :x}
    D :: Normal{Float64}
    μ :: Float64
    σ :: Float64
    Hermite6(a,μ,σ) = new(Polynomial(a), Normal(μ, σ), μ, σ)
end

(h::Hermite6)(x) = h.σ * (h.a).((x .- h.μ) / h.σ) .* 
  pdf.(h.D, x) / 24.0 .+ cdf.(h.D, x)

function test3()
    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]
    h = Hermite6(a,1.0,1.0)

    h.(u)
end

In the OP, everytime the function is called a Normal and a Polynomial struct needs to be created (and benchmarking shows about 3x allocation).

1 Like