Avoid intermediate arrays in reduction of broadcast

I have some code that looks like this:

A = randn(5,4,2)
B = randn(5,1,1)
C = zeros(5,4,1)
sum!(C, A .* B)  # avoid forming `A.*B` ?

Is there a simple way to avoid forming the intermediate array A.*B? I was looking into the reducing functions in Base but I couldn’t find a suitable function for this.

I know I can write a kernel by hand, but if there is something in Base for this that I’m missing I would like to exploit it.

Try

sum(x->*(x...), zip(A,B))

Edit: Sorry, didn’t notice the different array sizes

A = randn(5,4,2)
B = randn(5,1,1)
@time sum(A .* B)  #  0.000016 seconds (8 allocations: 624 bytes)
b = Base.broadcasted(*, A, B)
@time sum(b) # 0.000007 seconds (5 allocations: 176 bytes)

With larger arrays, the number of allocations are kept constant with the second approach

A = randn(500,4,2)
B = randn(500,1,1)
@time sum(A .* B)  # 0.000028 seconds (9 allocations: 31.547 KiB)
b = Base.broadcasted(*, A, B)
@time sum(b) # 0.000034 seconds (5 allocations: 176 bytes)

Not in base, but you can do this:

julia> using LazyArrays

julia> sum!(C, @~ A .* B)

Or possibly now sum!(C, LazyArray(@~ A .* B)) after a recent change. The difference is that sum(::Broadcasted) is not especially quick, although this will be fixed: #31020.

1 Like

Well that’s not correct, have a look at Base.broadcasted(*, A, B). It does not form the intermediate result unless you call materialize on it.

Unfortunately …

julia> sum!(C, Base.broadcasted(*, A, B))
ERROR: MethodError: no method matching sum!(::Array{Float64,3}, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3},Nothing,typeof(*),Tuple{Array{Float64,3},Array{Float64,3}}})

Why do you need C? You didn’t want to form another array, and no array is needed for the reduction using broadcasted
sum( Base.broadcasted(*, A, B))

But as mentioned, this approach seems to be about twice as slow currently.

Looking at that issue, it seems like dot (docs) will do the job for this case also (unless the arrays are complex in which case there will be an extra complex conjugate on the first argument).

I need C to store the result of the sum, which is also an Array.

As you wrote it, the result of sum is a scalar. Sum only returns arrays if you provide the dims keyword.

That’s my original example. Note the dimensions of C. Also note it’s sum!, not sum.

That was badly worded, I just meant that my solution was not in Base. Although how easily it can be done with only Base I don’t know. Writing sum(Base.broadcasted(*, A, B), dims=3) is also an error, which is the equivalent of sum!(C,... without a pre-allocated destination.

But even when reducing on all dimensions, there is still an issue of which methods of sum get used; LazyArray makes something <: AbstractArray which gets summed more efficiently right now, but there is a PR to fix that eventually.

1 Like