Broadcasting call to a vector-valued function

What is the most julian way to broadcast a call to a vector-valued function?

For example, take the following function.

function return_vec(x)
    return [x, x^2]
end

I would like to collect the outputs for each x in 1:15 and store the result in a 15×2 Matrix. I can do either of the following easily:

julia> @btime hcat(return_vec.(1:15)...)';
  548.913 ns (22 allocations: 2.11 KiB)

julia> @btime hcat(map(return_vec,1:15)...)';
  563.187 ns (22 allocations: 2.11 KiB)

but both run somewhat inefficiently. Alternatively, I can define:

function manual_broadcast(X)
    out = zeros(eltype(X), length(X), 2)
    for i in eachindex(X)
        out[i, :] .= return_vec(X[i])
    end
    return out
end

And get a 1.74x speedup:

julia> @btime manual_broadcast(1:15);
  316.522 ns (16 allocations: 1.47 KiB)

But is there any syntactic sugar that would effectively implement manual_broadcast? Surely this is common enough that I don’t need to write my own function to do it?
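(For reference, a common idiom not mentioned in the thread that avoids splatting a long argument list into hcat is reduce(hcat, ...), which has a specialized method for collections of arrays; this sketch reuses the question's return_vec:)

```julia
# return_vec as defined in the question
return_vec(x) = [x, x^2]

# reduce(hcat, ...) has an optimized method for a vector of vectors,
# avoiding the splatting overhead of hcat(xs...); permutedims turns
# the resulting 2×15 matrix into the desired 15×2 Matrix
M = permutedims(reduce(hcat, return_vec.(1:15)))
```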


julia> function return_vec(x)
           return (x, x^2)
       end
return_vec (generic function with 1 method)

julia> @btime stack(return_vec.(1:15); dims=1);
  58.087 ns (2 allocations: 608 bytes)

Does this follow the requirements of the question? It’s pretty fast.


Yep, exactly what I was looking for. Thanks! Hadn’t heard of stack before.

But when I run it, I get

@btime stack(return_vec.(1:15); dims=1);
  306.410 ns (17 allocations: 1.64 KiB)

Why so many more allocations??

Figured it out. You changed the return to a tuple. Makes sense.
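(The tuple-vs-vector difference can be illustrated directly; a minimal sketch with hypothetical names return_vec_arr and return_vec_tup, not from the thread:)

```julia
return_vec_arr(x) = [x, x^2]   # allocates a fresh 2-element Vector on the heap per call
return_vec_tup(x) = (x, x^2)   # returns an isbits Tuple: no heap allocation per call

# Broadcasting the tuple version gives a Vector{Tuple{Int,Int}} stored in one
# contiguous allocation; the vector version gives 15 separately allocated arrays.
tuples = return_vec_tup.(1:15)
arrays = return_vec_arr.(1:15)
```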


You can save a factor of 2 in time and memory by skipping the temporary array from broadcasting and instead passing return_vec as the first argument of stack:

julia> @btime stack(return_vec.(1:15); dims=1);
  96.559 ns (2 allocations: 608 bytes)

julia> @btime stack(return_vec, 1:15; dims=1);
  51.247 ns (1 allocation: 304 bytes)
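(Putting the thread's conclusion together as one runnable snippet; note that stack was added to Base in Julia 1.9, so earlier versions would need a different approach:)

```julia
# Return a tuple rather than a vector so each call is allocation-free
return_vec(x) = (x, x^2)

# stack(f, xs; dims=1) applies f to each element and stacks the results
# as rows, producing the desired 15×2 Matrix in a single allocation
M = stack(return_vec, 1:15; dims=1)
```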