I am looking for an efficient way to do this kind of operation (the snippet below is only an MWE, not the actual computation):

julia> A = rand(3,3,5);
julia> B = rand(3,3,5);
julia> mult_sum(a::AbstractMatrix, b::AbstractMatrix) = sum(a*b, dims=1)
mult_sum (generic function with 1 method)
julia> mapslices(mult_sum, A, B, dims = 3)
ERROR: MethodError: no method matching mapslices(::typeof(mult_sum), ::Array{Float64, 3}, ::Array{Float64, 3}; dims=3)
Closest candidates are:
mapslices(::Any, ::AbstractArray; dims) at abstractarray.jl:2828
Stacktrace:
[1] top-level scope
@ REPL[31]:1

The issue is that mapslices only accepts a single array. I know I could use eachslice and broadcasting, but the result would be a Vector of Matrix, whereas I need the result to stay a 3D Array. Reducing with cat means reallocating and is therefore inefficient. Is there any package out there for this kind of thing?
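To make that concrete, here is a minimal sketch of the eachslice workaround I mean (reusing the mult_sum from the MWE above), showing why a reallocating cat step is needed at the end:

```julia
A = rand(3, 3, 5)
B = rand(3, 3, 5)
mult_sum(a, b) = sum(a * b, dims=1)

# Mapping over slices gives a Vector of 1×3 Matrices...
slices = map(mult_sum, eachslice(A, dims=3), eachslice(B, dims=3))

# ...so getting back a 3D array requires reallocating with cat:
C = cat(slices...; dims=3)   # size(C) == (1, 3, 5)
```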

To clarify further: the operation needs to be differentiable and to work with CUDA, which means no array indexing and no array mutation…

That example does an element-wise multiplication of the two arrays; what I’m looking for is a way to do batched matrix multiplications. As another example, with the same A and B arrays as above, I’d like some way of writing:

C = A * B

where C[:,:,k] = A[:,:,k] * B[:,:,k], so C would have dimensions 3x3x5.
It looks a bit like what Tullio does in one of your links, but that package seems to be aimed at tensors (in the non-ML sense).
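For what it’s worth, here is a sketch of that batched product written with Tullio (assuming I’m reading its einsum-style notation right, it does accept plain arrays, not just tensors), alongside NNlib.jl’s batched_mul, which I understand targets exactly this C[:,:,k] = A[:,:,k] * B[:,:,k] pattern:

```julia
using Tullio, NNlib

A = rand(3, 3, 5)
B = rand(3, 3, 5)

# Einsum-style batched matrix multiply: contract over l, batch over k.
@tullio C[i, j, k] := A[i, l, k] * B[l, j, k]

# NNlib's batched_mul computes the same slice-wise product.
D = NNlib.batched_mul(A, B)   # a 3×3×5 Array
```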