Broadcasting functions with multiple arguments in same dimension using Mapslices

Hi everyone,
I have a simple code as shown below. Both a and y are 4D matrices. I want the function f(x, y) to operate on the third dimension, summing the x matrix along the 3rd dimension, summing the y matrix along the 3rd dimension, and then adding the results together.

julia> a = reshape(1:24, (2,2,3,2))
julia> y = ones(2,2,3,1)
julia> f(x,y) = sum(x) .+ sum(y)
julia> mapslices(x -> f(x,y), a, dims=3)

However, when using mapslices to broadcast in the third dimension, the result is not as expected. While sum(x) correctly operates along the 3rd dimension, sum(y) ends up summing the entire matrix (returning a scalar) instead of just the 3rd dimension.

Is mapslices possible to handle multiple arguments in the same dimension?
Is there any other way to do this?
Any suggestions will be appreciated.

Just write a loop?

I agree a for loop would absolutely work in this example. But my actual data is a (500x1000x250x300) 4D matrix. Rather than a simple operation like sum, I need to interpolate at each 1st, 2nd, and 4th dimension. So that would be 3 for loops. Broadcasting would save a lot more time if this is possible.

What time? In terms of runtime, broadcasting isn’t faster than a well-written loop, and could easily be slower for something complicated where you have to wrestle with the broadcast machinery to get it to do what you want. In terms of your time, wrestling with the broadcast machinery and mapslices to contort it into doing what you want may take more time, and result in less readable code, than a simple loop (or nested loop).

That being said, it is always fun to see what clever ways people have of expressing computations as compositions of library functions.

Thanks for your thoughts. Yes, I am referring to the runtime. I think you’re right, broadcasting may not always be faster than using for loops. However, it would be beneficial to test the efficiency of both approaches to save as much time as possible when dealing with large matrices.

Maybe this is what you want?

julia> map(f, eachslice(a; dims=3), eachslice(y; dims=3))
3-element Vector{Float64}:

You can also compute this without dropping the singular dimensions as follows:

julia> sum(a; dims=(1, 2, 4)) + sum(y; dims=(1, 2, 4))
1×1×3×1 Array{Float64, 4}:
[:, :, 1, 1] =

[:, :, 2, 1] =

[:, :, 3, 1] =

broadcast is implemented in Julia, with loops. So, by definition, it is always possible (if you know what you’re doing) to write a loop that runs at least as fast as a broadcast.

This isn’t something like Numpy or Matlab where “built-in” functions have a huge advantage over user-written loops.

Thanks for the explanation. So perhaps the best thing for me to do is to optimize the for loops rather than trying to use mapslices, which might not work for multiple arguments in the same dimension.

Thanks for your input @danielwe, your code is close to what I want, but what I need is:
julia> sum(a; dims=3) .+ sum(y; dims=3) or similarly: julia> map(f, eachslice(a; dims=(1,2,4)), eachslice(y; dims=(1,2,4))) (though this code won’t work due to dimension mismatch).
Note that sum is just a simple example, but really the function is more complicated in my application.

Works for me:

julia> sum(a; dims=3) .+ sum(y; dims=3)
2×2×1×2 Array{Float64, 4}:
[:, :, 1, 1] =
 18.0  24.0
 21.0  27.0

[:, :, 1, 2] =
 54.0  60.0
 57.0  63.0

Works for me:

Ah, I was referring to julia> map(f, eachslice(a; dims=(1,2,4)), eachslice(y; dims=(1,2,4)))

It better if you can post a working example that does what you want, even just using loops. Then you can ask for help in making the loops faster and/or replacing them with more concise function calls.


Yes, that’s a good idea. Here is my working example: I am running a simulation in the context of physical oceanography.

I have a 4D matrix u (velocity) with x, y, z (Cartesian coordinates), and time in the 4th dimension. The domain has rough topography at the bottom, so the deepest depth at each x and y is not the same, similar to our real ocean. My goal is to take terrain-following horizontal averages, i.e., dims=(1,2), in the height-above-bottom coordinate.

For example, if I want to take the horizontal average of u at 20 meters above the topography, I need to find every point in the x and y (horizontal) dimensions that are 20 meters above the corresponding topography and average all these points.

This can be done using for loops. Thanks to @stevengj’s suggestion, my goal is to optimize the code as efficiently as possible owing to the fact that broadcasting functions might not be more efficient. So any suggestions or tips on optimization would be great.
The code for my terrain-following average is below:

using Interpolations
Nx = 500; Ny = 1000; Nz = 250;
# hab: height above bottom coordinate;
# zC is a vector indicating the vertical grid points
# z_topography(x,y) is a 2D matrix showing the depth of the topography at each points
hab = ones(Nx,Ny).*reshape(zC,1,1,Nz) .- z_topography  
new_height = 0:40:3000.   # desired height above bottom grids
hab_interp = zeros(Nx,Ny,Nz)

function terrain_following_avg(hab, u, new_height)

    u_interp = zeros(Nx, Ny, length(new_height), size(u,4))
    for i in 1:Nx, j in 1:Ny
        for tt = 1:size(u,4)
        hab_interp = hab[i, j, :]
        itp = interpolate((hab_interp,), u[i, j, :, tt], Gridded(Linear()))
        itp_extrapolated = extrapolate(itp, Interpolations.Flat())
        u_interp[i, j, :, tt] .= itp_extrapolated.(new_height)
    return dropdims(mean(u_interp,dims=(1,2)) , dims=(1,2))
# call the function to get the terrain following averaged velocity (assuming u is a 4D matrix) 

u_avg = terrain_following_avg(hab, u, new_height);

The result should be a 2D matrix in terms of z and time. (I am not sure if this post should be a new discussion since this is a little off track to my original question regarding broadcasting.)

Don’t use global variables inside your function — pass them as parameters. (Sizes like Nx don’t need to be passed explicitly because you can get them from the array sizes.) Use @views for slices without copies. Fix your loop order, and consider re-ordering the dimensions of your array, so that you access the array contiguously (column-major). These are all in the Julia performance tips.

Don’t allocate the u_interp array just to compute its mean at the end. Pre-allocate the result (mean) array and sum directly into it as you compute the entries.

1 Like

Thanks so much for your tips! I will try them out!