Optimal way to broadcast over a container of funcs

miguelborrero · June 24, 2024, 5:49pm

Hi there,

I have a vector (more generally: a list) of functions the map a vector to a vector and also a matrix where each row is to be populated by each of the functions. See below an example:

function test()
    f_vec = [x -> x.^2 .+ i for i in 1:10]
    grid = collect(1:12)
    cont = zeros(length(f_vec), length(grid))
    
    for f_ind in 1:length(f_vec)
        cont[f_ind, :] .= f_vec[f_ind](grid)
    end


end

I get the following benchmarking results:

Which not seem ideal. Is there an optimal way of doing so?

Thanks a lot in advance!

GunnarFarneback · June 24, 2024, 5:59pm

You will not get a very meaningful benchmarking result if you access non-const globals inside your function. I would start by making cont and grid arguments to your test function.

miguelborrero · June 24, 2024, 6:06pm

Yep, I just modified the post to reflect this. Thanks.

mikmoore · June 24, 2024, 6:25pm

Try either of these (the latter using map rather than allocate-and-fill)

function test2()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1:12)
    cont = zeros(length(f_vec), length(grid))
    
    for f_ind in 1:length(f_vec)
        cont[f_ind, :] .= f_vec[f_ind].(grid)
    end
    return cont
end

function test3()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1:12)

    cont = map(((f,g),) -> f(g), Iterators.product(f_vec, grid))
    return cont
end

Notably, I changed your functions to not operate on vectors. Instead, they operate on scalars and I broadcast (or map) the function itself over the vector.

I’ll also discourage you from using collect in most situations. grid = 1:12 would have worked fine and would save memory and time.

I’ll remark that this is performant because your functions are all related closures. If you instead did a big list of dissimilar functions (e.g., f_vec = [sin, cos, tan, abs, exp] you’d have dynamic dispatch to contend with and a different pattern would be needed.

DNF · June 24, 2024, 6:30pm

Keep in mind that you are now also benchmarking the time it takes to build the vector of functions and the vector of function inputs. If you want to time only the application, you should create those outside the test function and pass them as inputs.

BTW, there isn’t really a ‘list’ container that is more general than Vector. The thing that you would call a ‘list’ in e.g. Python, is just a Vector{Any} in Julia.

Dan · June 24, 2024, 6:37pm

This looks short and easy to read:

function test4()
    f_vec = [x -> x^2 + i for i in 1:10]
    grid = collect(1.0:1.0:12.0)
    [f(g) for f in f_vec, g in grid]
end

BTW no need to collect the range in grid.

miguelborrero · June 24, 2024, 10:56pm

Agreed, I just wanted to refer to the fact that it need to be a vector. Just a container or list of some form like a tuple. But thanks for the comment.

miguelborrero · June 24, 2024, 11:48pm

Thanks a lot for the suggestions. However, I had the function operating on vectors on purpose since I found that the banchmarking results where better. See the example below:

function test()

    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]

    function hermite5_cdf_temp(a::AbstractArray{TV, 1}, μ::TV, σ::TV) where TV
        func(x) = σ * Polynomial(a)((x - μ) / σ) * pdf.(Normal(μ, σ), x) / 24.0 + cdf(Normal(μ, σ), x)
        return func
    end

    hermite5_cdf_temp(a, 1.0, 1.0).(u)

end

Which benchmarks to:

and then compare to this, which is the same function but operating on vectors:

function test()

    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]

    function hermite5_cdf_temp(a::AbstractArray{TV, 1}, μ::TV, σ::TV) where TV
        func(x) = σ * Polynomial(a).((x .- μ) / σ) .* pdf.(Normal(μ, σ), x) / 24.0 .+ cdf.(Normal(μ, σ), x)
        return func
    end

    hermite5_cdf_temp(a, 1.0, 1.0)(u)

end

and benchmarks to:

Therefore I would like to keep it acting on vectors, thanks! Any ideas given this?

stevengj · June 25, 2024, 12:05am

In particular, an array/vector of functions is an abstractly typed container, and really inhibits Julia’s type inference. You might be better off with careful use of a tuple of functions (or using fancy tricks like FunctionWrappers.jl), if you don’t want to re-think your whole approach.

Dan · June 25, 2024, 12:39am

In Julia the call operation can be overloaded for types, this allows a nice method to get the functionality in the original post:

using Polynomials, Distributions

struct Hermite5
    a :: Vector{Float64}
    σ :: Float64
    μ :: Float64
end

(h::Hermite5)(x) = h.σ * Polynomial(h.a).((x .- h.μ) / h.σ) .* 
  pdf.(Normal(h.μ, h.σ), x) / 24.0 .+ cdf.(Normal(h.μ, h.σ), x)

function test2()
    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]
    h = Hermite5(a,1.0,1.0)

    h.(u)
end

This method might not only look better, but also perform better because of problems with captured variables in closures when handling functions.

miguelborrero · June 25, 2024, 12:51am

It did not perform better but thanks for the comment!

Dan · June 25, 2024, 1:01am

The trick would be to optimize the function application by removing repeated generation/allocation of structures:

struct Hermite6
    a :: Polynomial{Float64, :x}
    D :: Normal{Float64}
    μ :: Float64
    σ :: Float64
    Hermite6(a,μ,σ) = new(Polynomial(a), Normal(μ, σ), μ, σ)
end

(h::Hermite6)(x) = h.σ * (h.a).((x .- h.μ) / h.σ) .* 
  pdf.(h.D, x) / 24.0 .+ cdf.(h.D, x)

function test3()
    u = zeros(Float64, 30)
    a = [1.0, 0.0, 0.0, 0.0, 0.0]
    h = Hermite6(a,1.0,1.0)

    h.(u)
end

In the OP, everytime the function is called a Normal and a Polynomial struct needs to be created (and benchmarking shows about 3x allocation).

Topic		Replies	Views
Broadcasting call to a vector-valued function Performance broadcast	4	318	November 11, 2023
0.6: how do you broadcast `call` over a vector of functions? General Usage broadcast	26	4363	March 23, 2017
Best way to vectorize a function General Usage package	6	2570	July 6, 2022
What's the "right" way to broadcast vector-valued functions? New to Julia broadcast , array	4	1596	November 6, 2019
Fast evaluation of multiple functions on multiple arguments Performance	10	2386	June 7, 2019

Optimal way to broadcast over a container of funcs

Related topics