I want to perform a mapreduce over multiple iterators that don’t have the same dimensions, but which can be broadcast together. For example:
arr1 = rand(1, 1, 3)
arr2 = rand(100, 100, 1)
mapreduce(+, *, arr1, arr2; dims=(1, 2))
ERROR: DimensionMismatch("dimensions must match: a has dims (Base.OneTo(1), Base.OneTo(1), Base.OneTo(3)), b has dims (Base.OneTo(100), Base.OneTo(100), Base.OneTo(1)), mismatch at 1")
Is this possible? Or, if not, are there any Julia utilities that can 'expand' the dimensions of an array along one or more axes without allocating the expanded memory?
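A minimal CPU sketch of the LazyArrays approach that the next reply refers to (the lazy wrapper avoids ever allocating the full broadcast result):

using LazyArrays
arr1 = rand(1, 1, 3)
arr2 = rand(100, 100, 1)
# wrap the broadcast lazily; no (100, 100, 3) array is materialized
lazy = LazyArray(@~ arr1 .+ arr2)
reduce(*, lazy; dims=(1, 2))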
Thanks for that suggestion. It seems to work for CPU code, but the ultimate destination for this mapreduce is the GPU, and CUDA's interaction with LazyArrays seems to trigger scalar indexing.
For example:
using CUDA
using LazyArrays
arr1 = CUDA.rand(100, 100, 1)
arr2 = CUDA.rand(1, 1, 3)
arr3 = LazyArray(@~ arr1 .+ arr2)
reduce(+, arr3; dims=(1, 2))
Warning: Performing scalar indexing on task Task (runnable)
(The reason I'm trying to use mapreduce or reduce here is that these functions are consistently faster than the manual reductions I'm able to write myself.)
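By a 'manual reduction' I mean something along these lines, sketched here on the CPU for the shapes above:

function manual_reduce(arr1, arr2)
    # hand-written equivalent of reduce(+, arr1 .+ arr2; dims=(1, 2))
    # for the (100, 100, 1) and (1, 1, 3) shapes above
    out = zeros(eltype(arr1), 1, 1, size(arr2, 3))
    for k in axes(arr2, 3), j in axes(arr1, 2), i in axes(arr1, 1)
        out[1, 1, k] += arr1[i, j, 1] + arr2[1, 1, k]
    end
    return out
end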
@maleadt I saw this commit suggesting that mapreduce in GPUArrays already supports broadcasting. Is that correct, and if so, is there a flag or similar required to enable it?
@jling Yeah, it's a poor minimal example, but it does work. The error you're seeing is due to > having no defined identity element; if you provide an init=0, for example, it will work.
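A minimal illustration of that point, on hypothetical data rather than the original example:

A = rand(3, 4)
reduce(>, A; dims=1)            # errors: > has no identity element to start from
reduce(>, A; dims=1, init=0.0)  # works once an explicit init is supplied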
The actual use case here is a type of 2D Fourier transform. It's a bit too heavy to get into the details, but the map operation is a complex exponential, and the reduction sums across all the data points onto a 2D grid.
I have one working workaround, but I don't know if it's optimal. It does, however, have the nice property that it works on the GPU too:
function broadcast_reduce(f, op, args...; dims, init)
    # build a lazy Broadcasted object; nothing is materialized here
    bc = Broadcast.instantiate(Broadcast.broadcasted(f, args...))
    return reduce(op, bc; dims, init)
end
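For example, here is how the Fourier-style use case above might look with it (all names and shapes are hypothetical): point data broadcast along the third dimension against a 2D frequency grid, then summed over the points.

N, Nk = 1000, 64
x  = reshape(rand(Float32, N), 1, 1, N)   # point coordinates
y  = reshape(rand(Float32, N), 1, 1, N)
v  = reshape(rand(Float32, N), 1, 1, N)   # point values
kx = reshape(collect(Float32, range(-pi, pi, length=Nk)), Nk, 1, 1)
ky = reshape(collect(Float32, range(-pi, pi, length=Nk)), 1, Nk, 1)
# map is a complex exponential; the reduction sums every point onto the grid
phase(kx, ky, x, y, v) = v * cis(-(kx * x + ky * y))
F = broadcast_reduce(phase, +, kx, ky, x, y, v; dims=3, init=0.0f0 + 0.0f0im)
size(F)  # (64, 64, 1)

The same call runs with CuArray inputs in place of the CPU arrays, without materializing the full (64, 64, 1000) intermediate.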
Kind of, but there's no user-facing interface. This should be done with LazyArrays; we're probably just missing some dispatches for those lazy array types to be compatible with the GPU broadcast implementation.