Using argument vectors along specific dimensions during broadcast()

I’m wondering if there is a way to use vector arguments along specific dimensions during broadcast().

For example, consider a function f(x,y,z) taking three arguments corresponding to the 3D Cartesian coordinates. I have the x-, y-, z-coordinates as three separate vectors: xs::Vector{Float64}, ys::Vector{Float64}, zs::Vector{Float64}. I would like to broadcast f() over the 3D grid defined by xs, ys, zs. This can be achieved by

Nx, Ny, Nz = length(xs), length(ys), length(zs)
xs2 = reshape(xs, (Nx, 1, 1))
ys2 = reshape(ys, (1, Ny, 1))
zs2 = reshape(zs, (1, 1, Nz))

F = broadcast(f, xs2, ys2, zs2)

but reshape() allocates. So I’m wondering if there is a way to achieve the above without reshape(), perhaps something like

F = broadcast(f, xs, ys, zs, dims=(1,2,3))

where dims specifies the dimensions along which the three arguments should be aligned.

Or, is there a way to turn a vector into an array of arbitrary dimensionality without allocation? I tried permutedims(), but it didn’t work.

reshape should not allocate.

It does allocate a small amount for built-in Arrays (just for the array header; the data are not copied). You unfortunately need to opt in explicitly to the pure-Julia reshape with ReshapedArray (cf. #36313).
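For reference, a minimal sketch of the explicit opt-in, assuming the internal constructor Base.ReshapedArray (it is not public API, so its signature may change between Julia versions):

```julia
xs = rand(5)

# Wrap xs as a 5×1×1 array without copying the data. The third argument
# is a (here empty) tuple of precomputed multiplicative inverses used for
# index computation; it can be empty for a linearly indexed parent.
xs3 = Base.ReshapedArray(xs, (length(xs), 1, 1), ())

size(xs3)          # (5, 1, 1)
xs3.parent === xs  # true: it is a view, not a copy
```

This avoids even the header allocation of reshape(), at the cost of relying on an internal type.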

Instead of broadcasting, I’d use Iterators.product here:

F = [f(x,y,z) for (x,y,z) in Iterators.product(xs, ys, zs)]
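As a concrete sanity check, the comprehension over Iterators.product produces the same array as the reshape-based broadcast (small sizes chosen here just for illustration):

```julia
f(x, y, z) = x + y + z
xs, ys, zs = rand(3), rand(4), rand(5)

# reshape-based broadcast over the 3D grid
F1 = broadcast(f, reshape(xs, :, 1, 1), reshape(ys, 1, :, 1), reshape(zs, 1, 1, :))

# comprehension over the Cartesian product of the three vectors
F2 = [f(x, y, z) for (x, y, z) in Iterators.product(xs, ys, zs)]

F1 == F2  # true
size(F2)  # (3, 4, 5)
```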

Perhaps

using Tullio
@tullio F[i,j,k] := f(x[i], y[j], z[k])

@mbauman and @hendri54, thanks for the cool tricks. Actually, I’m doing this on GPU as well, and both methods use scalar indexing, which is slow on GPU. For example, with the setup

using CUDA
CUDA.allowscalar(false)
xs = cu(rand(10)); ys = cu(rand(20)); zs = cu(rand(30));
f(x,y,z) = x+y+z

I get

julia> F = [f(x,y,z) for (x,y,z) in Iterators.product(xs, ys, zs)]
ERROR: Scalar indexing is disallowed.
...

and

julia> using Tullio

julia> F = cu(rand(10,20,30));

julia> @tullio F[i,j,k] := f(xs[i], ys[j], zs[k])
ERROR: Scalar indexing is disallowed.

Wondering if there is a GPU-friendly workaround.
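For what it’s worth, the original reshape-based broadcast should itself be GPU-friendly: reshape of a CuArray returns a reshaped CuArray without copying, and the broadcast compiles to a single fused GPU kernel, so no scalar indexing is involved. A sketch, assuming a CUDA-capable device:

```julia
using CUDA
CUDA.allowscalar(false)

f(x, y, z) = x + y + z
xs = cu(rand(10)); ys = cu(rand(20)); zs = cu(rand(30));

# reshape on a CuArray is allocation-light (no data copy), and the dot
# broadcast runs as one fused kernel on the GPU.
F = f.(reshape(xs, :, 1, 1), reshape(ys, 1, :, 1), reshape(zs, 1, 1, :))

size(F)  # (10, 20, 30)
```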