Tensor Contractions for N-dimensional tensors

I am working with exponentially sized tensors (exponential in a given parameter N, in my case with base 4). Up to now, I have stored the tensors linearly indexed, that is, in a vector indexed 1:4^N. Contractions were therefore straightforward using packages such as TensorOperations.jl.
As a next step, I want to exploit the sparsity of the matrices I wish to contract the vector with. It is therefore beneficial (on paper) to store the vector as an N-dimensional tensor, [1:4, 1:4, …, 1:4], because then I only need to contract ONE of the N 1:4 blocks (the others drop out due to the sparsity).
There are a few ways to achieve this. I could store the vector linearly indexed and then “by hand” (i.e. using the digits function with base 4) find the relevant indices that I need to contract. This would be my way to go, but here it is said that this might not be a good idea.
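As a side note on the "by hand" option, the index arithmetic that the digits function gives you looks roughly like this (a sketch; the variable names are mine):

```julia
# Map between a linear index into a vector of length 4^N and the
# corresponding multi-index (d_1, ..., d_N) with each d_k in 1:4.
N = 3
i = 23                                       # some linear index in 1:4^N
d = digits(i - 1, base = 4, pad = N) .+ 1    # multi-index, 1-based
# Inverting the map recovers the linear index:
@assert i == 1 + sum((d[k] - 1) * 4^(k - 1) for k in 1:N)
```

The k-th entry of d is then exactly the index of the k-th 1:4 block you would contract over.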
The other way would be to store the vector as [1:4, 1:4, …, 1:4] and then use TensorOperations.jl's lower-level functions such as tensorcontract(…) to say which indices to contract. This feels somewhat hacky, because you need an N-element list of index labels in which the k-th entry is set so that it contracts as desired. I am worried about performance, and this surely does not sound optimal.
However, I could not find a better option. Ideally, there would exist a function contract(A, B, 2, 5) to contract tensors A and B along dimensions 2 and 5, respectively. Did anyone ever run into a similar problem, and if so, how did you solve it? Thanks a lot!

Using TensorOperations.tensorcontract sounds reasonable. For concreteness, I believe a function like this one is roughly what you meant (correct me if I am wrong).

function contract(A, B, kA, kB)
    IA = collect(-ndims(A):-1)
    IB = collect(1:ndims(B))
    IA[kA] = 0
    IB[kB] = 0
    return tensorcontract(A, IA, B, IB)
end

I agree that creating IA and IB seems wasteful, but it's actually not that bad: the contraction itself scales exponentially in N, while creating IA and IB scales only linearly in N, so the cost of creating them will most likely be negligible compared to the contraction.
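For concreteness, here is a self-contained usage sketch of that function on small random tensors (assuming, as in the snippet above, that tensorcontract accepts integer index labels; the sanity check on sum(C) is chosen to be independent of the order of the output dimensions):

```julia
using TensorOperations

function contract(A, B, kA, kB)
    IA = collect(-ndims(A):-1)   # unique negative labels for A's open indices
    IB = collect(1:ndims(B))     # unique positive labels for B's open indices
    IA[kA] = 0                   # label 0 appears in both lists,
    IB[kB] = 0                   # so these two dimensions are contracted
    return tensorcontract(A, IA, B, IB)
end

A = rand(4, 4, 4)
B = rand(4, 4)
C = contract(A, B, 2, 1)         # contract dim 2 of A with dim 1 of B
@assert size(C) == (4, 4, 4)     # 2 open dims from A + 1 open dim from B

# sum(C) factorizes over the contracted index, whatever the output order:
sA = vec(sum(A; dims = (1, 3)))
sB = vec(sum(B; dims = 2))
@assert sum(C) ≈ sum(sA .* sB)
```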


Yes, I came up with the same thing but without the nice form for IA and IB. Thanks a lot, I’ll be using this!

This might be something for @mcabbott’s Tullio.jl ?

Here’s a quick test with N=4; TensorOperations will do better at larger N, and Einsum better at smaller N:

julia> A = rand(4,4,4,4); B = rand(4,4,4,4);

julia> using TensorOperations, Einsum, Tullio, LoopVectorization
julia> ten33(A,B) = @tensor C[a,b,c,x,y,z] := A[a,b,s,c] * B[x,y,s,z]
julia> ein33(A,B) = @einsum C[a,b,c,x,y,z] := A[a,b,s,c] * B[x,y,s,z]
julia> tul33(A,B) = @tullio C[a,b,c,x,y,z] := A[a,b,s,c] * B[x,y,s,z]  tensor=false

julia> @btime ten33($A,$B);
  12.612 μs (80 allocations: 36.94 KiB)

julia> @btime ein33($A,$B);
  11.752 μs (3 allocations: 32.20 KiB)

julia> @btime tul33($A,$B);
  3.596 μs (14 allocations: 32.56 KiB)

To make a function which accepts the dimensions as arguments, with Einsum you can write a generated function contract(A, B, Val(3), Val(3)) which builds the macro call. But Tullio defines functions and thus doesn’t play well with @generated, unfortunately.

You could write a function which always contracts the 2nd index, and then reshapes. Something like this:

sizeA3 = prod(size(A,d) for d in 1:kA-1), size(A,kA), prod(size(A,d) for d in kA+1:ndims(A))
A3 = reshape(A, sizeA3)
B3 = reshape(B, ...) # similar

@tullio C4[a,b,x,y] := A3[a,s,b] * B3[x,s,y] avx=true tensor=false

sizeCA = filter(!iszero, ntuple(d->d==kA ? 0 : size(A,d), ndims(A)))
sizeCB = filter(!iszero, ntuple(d->d==kB ? 0 : size(B,d), ndims(B))) # similar, for B
C = reshape(C4, tuple(sizeCA..., sizeCB...))
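Assembled into one runnable function, the sketch above might look like this (a sketch only; the name contract_reshape is mine, and I have dropped the avx/tensor flags to keep it portable):

```julia
using Tullio

function contract_reshape(A, B, kA, kB)
    # Flatten each tensor to 3 dims: (before, contracted, after).
    sizeA3 = (prod(size(A)[1:kA-1]; init = 1), size(A, kA),
              prod(size(A)[kA+1:end]; init = 1))
    sizeB3 = (prod(size(B)[1:kB-1]; init = 1), size(B, kB),
              prod(size(B)[kB+1:end]; init = 1))
    A3 = reshape(A, sizeA3)
    B3 = reshape(B, sizeB3)
    # Always contract the 2nd index of both reshaped tensors.
    @tullio C4[a, b, x, y] := A3[a, s, b] * B3[x, s, y]
    # Restore the open dimensions of A, then those of B.
    sizeCA = filter(!iszero, ntuple(d -> d == kA ? 0 : size(A, d), ndims(A)))
    sizeCB = filter(!iszero, ntuple(d -> d == kB ? 0 : size(B, d), ndims(B)))
    return reshape(C4, (sizeCA..., sizeCB...))
end
```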

Edit: a bit surprisingly, this reshape-based function is much quicker than the one using the @tensor macro.

julia> @btime contract($A,$B,3,3);
  5.962 μs (46 allocations: 39.52 KiB)

julia> function contract2(A,B,kA,kB)
           IA = ntuple(d -> d==kA ? 0 : -d, ndims(A))
           IB = ntuple(d -> d==kB ? 0 : +d, ndims(B))
           return TensorOperations.tensorcontract(A,IA,B,IB)
       end

julia> @btime contract2($A,$B,3,3);
  5.366 μs (47 allocations: 39.47 KiB)

That’s the effect of the cache for temporaries being used with the @tensor macro. It is helpful for large tensor contractions involving several temporaries, but for small test cases like this, it is better to run with TensorOperations.disable_cache().
