TensorFlow `matmul` equivalent in Julia: matrix multiplication with two given tensor dimensions

Does Julia have an equivalent of TensorFlow's `matmul`? I need matrix multiplication over two given (or predefined) dimensions of the tensors, preserving the other dimensions.

TensorFlow's matmul uses the two innermost tensor dimensions for matrix multiplication and preserves the remaining dimensions. In pseudocode, A_ijkmn = sum_x(B_ijkmx * C_ijkxn), summing over the index x, which is the last dim of B and the second-to-last dim of C. So the contracted dims must agree for multiplication, and the preserved dims must be the same in both tensors. The actual indices do not matter.
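To pin down the semantics of the pseudocode above, here is a minimal reference sketch in plain Julia with no packages. The function name `batched_matmul_reference` is made up for illustration. It assumes the last two dimensions are the matrix dimensions and all leading dimensions are batch dimensions, as in the pseudocode. It is written for clarity, not speed, and is not meant for CuArrays.

```julia
# Reference (unoptimized) batched matrix multiplication.
# Convention (an assumption taken from the pseudocode): the last two
# dimensions are multiplied, all leading dimensions are preserved.
function batched_matmul_reference(B::AbstractArray{T,N}, C::AbstractArray{T,N}) where {T,N}
    N >= 2 || throw(ArgumentError("need at least two dimensions"))
    batch = size(B)[1:N-2]                      # preserved (batch) dimensions
    batch == size(C)[1:N-2] || throw(DimensionMismatch("batch dimensions differ"))
    m, x = size(B, N - 1), size(B, N)
    size(C, N - 1) == x || throw(DimensionMismatch("inner dimensions differ"))
    n = size(C, N)

    A = zeros(T, batch..., m, n)
    for I in CartesianIndices(batch)            # one matrix product per batch index
        for q in 1:n, p in 1:m
            acc = zero(T)
            for s in 1:x                        # sum over the contracted index
                acc += B[I, p, s] * C[I, s, q]
            end
            A[I, p, q] = acc
        end
    end
    return A
end

B = rand(2, 3, 4, 5, 6)
C = rand(2, 3, 4, 6, 7)
A = batched_matmul_reference(B, C)              # size(A) == (2, 3, 4, 5, 7)
```

Each slice `A[i, j, k, :, :]` should then agree with the ordinary product `B[i, j, k, :, :] * C[i, j, k, :, :]`.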

I need this to play nicely with CuArrays and Zygote. The ultimate goal is to use it in a deep learning model.

I am aware of TensorOperations, and might end up using that. The drawbacks are:

  • I would rather not pull in a new dependency if similar functionality exists in the packages that ship with the standard Julia distribution
  • Dimensions need to be specified explicitly

mul!(c, a, b)

I think the right term for this is batched matrix multiplication. For every value of the preserved indices (i, j, k in your pseudocode), there is one mul! to be done.
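A rough sketch of that "one mul! per batch index" idea, using only the LinearAlgebra standard library. The name `batched_mul!` is hypothetical here and is not the API of any package mentioned in this thread. It keeps the question's layout, with the batch dimensions first and the matrix dimensions last.

```julia
using LinearAlgebra

# Hypothetical helper: loop over the leading (batch) indices and call
# mul! on the matching matrix slices. A must be preallocated.
function batched_mul!(A, B, C)
    batch = CartesianIndices(size(A)[1:ndims(A)-2])
    for I in batch
        # views avoid copying the slices; mul! writes the product into A in place
        mul!(view(A, I, :, :), view(B, I, :, :), view(C, I, :, :))
    end
    return A
end

B = rand(2, 3, 4, 5)
C = rand(2, 3, 5, 6)
A = zeros(2, 3, 4, 6)
batched_mul!(A, B, C)
```

Two caveats. Julia arrays are column-major, so putting the batch dimensions last instead would give contiguous matrix slices and is likely faster. And because this version mutates `A`, Zygote will not differentiate through it as written, since Zygote does not support array mutation.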

TensorOperations.jl doesn’t allow this operation, and it isn’t Zygote-differentiable by default (although it does now support CuArrays).

You can use BatchedRoutines.jl and its Cu friend for this. But they don’t handle derivatives. IIRC you can borrow that from Transformers.jl if you dig a bit.

There is also a PR to add some of this to NNlib.

I think this functionality is available in OMEinsum and (possibly without the derivs) in TensorOperations. Both packages need specifying the dimensions explicitly, though.

Yes, I just saw that this may have been added to OMEinsum too, on master? (I can’t seem to install it to check.) But TensorOperations does not allow this:

julia> using TensorOperations

julia> B = rand(2,2,2,2,2); C = rand(2,2,2,2,2);

julia> @tensor A[i,j,k,m,n] := B[i,j,k,m,x] * C[i,j,k,x,n]
ERROR: TensorOperations.IndexError{String}("non-matching indices between left and right hand side: \$(Expr(:(:=), :(var\"##255\"[i, j, k, m, n]), :(var\"##253\"[i, j, k, m, x] * var\"##254\"[i, j, k, x, n])))")
pkg> add OMEinsum

Works for me.

TensorOperations does not allow this

Weird, this must be a recently introduced bug. Their docs claim they can do it.