Fastest way of contracting arrays

Frisus95 · July 10, 2021, 1:11pm

Good afternoon everyone!

I have the following piece of code, which essentially contracts some multidimensional arrays along specific dimensions (this is a no-sense example, which has the only purpose of illustrating clearly the situation):

d = 10
A=randn(d,d,d,d,d)
B=randn(d,d,d,d,d)

function funza()
acc = 0.0
for i = 1:1:d, j= 1:1:d, k= 1:1:d, l= 1:1:d, m = 1:1:d
acc += A[i,1,1,1,1]A[l,1,1,1,1]A[k,1,1,1,1]A[j,1,1,1,1]A[m,1,1,1,1]*B[m,l,k,j,i]
end
return acc
end

I get the following time of execution:

@time funza()
0.209131 seconds (2.12 M allocations: 61.713 MiB, 1.55% gc time)

But I’m 99% sure that there are ways much faster than this in order to perform the above contraction.

I thought about using TensorOperations.jl, but (as far as I can see) I don’t have the possibility to set explicitly some values for the indices (in the above no-sense example, all explicit indices have value 1): I obviously can’t compute the full tensor, which has d^(20) entries.
I also thought about using CUDA.jl, but I’m not sure that there are ready-made methods for this type of operations… am I wrong?

In any case, how can I improve the performance of this kind of contraction?

Thank you!

Oscar_Smith · July 10, 2021, 1:22pm

A, B, and D are currently globals. If you pass them into the function like

d = 10
A=randn(d,d,d,d,d)
B=randn(d,d,d,d,d)
function funza(A, B, d)
    acc = 0.0
    for i = 1:1:d, j= 1:1:d, k= 1:1:d, l= 1:1:d, m = 1:1:d
        acc += A[i,1,1,1,1]*A[l,1,1,1,1]*A[k,1,1,1,1]*A[j,1,1,1,1]*A[m,1,1,1,1]*B[m,l,k,j,i]
    end
    return acc
end
@time funza(A,B,d)

it takes 370μs

Frisus95 · July 10, 2021, 1:26pm

That’s astounding! Thank you very much.

Just out of curiosity: is there an “easy” way to do the same thing with the GPU?

Because it might be useful to use very high d, and in that case I would certainly have a boost using CUDA.jl …

mcabbott · July 10, 2021, 1:31pm

julia> @btime funza($A,$B,$d)
  262.666 μs (0 allocations: 0 bytes)
51.78137833703912

julia> using Tullio

julia> @btime @tullio _ := $A[i,1,1,1,1]*$A[l,1,1,1,1]*$A[k,1,1,1,1]*$A[j,1,1,1,1]*$A[m,1,1,1,1]*$B[m,l,k,j,i]
  124.750 μs (1 allocation: 16 bytes)
51.781378337039484

julia> using LoopVectorization

julia> @btime @tullio _ := $A[i,1,1,1,1]*$A[l,1,1,1,1]*$A[k,1,1,1,1]*$A[j,1,1,1,1]*$A[m,1,1,1,1]*$B[m,l,k,j,i]
  31.541 μs (27 allocations: 720 bytes)
51.78137833704028

This example unfortunately won’t work on the GPU right now, scalar reductions are tricky. Something along these lines may work, though:

julia> using TensorOperations

julia> function funza_ten(A, B)
         v = A[:,1,1,1,1]
         @tensor out = v[i]*v[l]*v[k]*v[j]*v[m]*B[m,l,k,j,i]
       end
funza_ten (generic function with 1 method)

julia> @btime funza_ten($A, $B)
  247.417 μs (123 allocations: 8.36 KiB)
51.781378337036784

gustaphe · July 10, 2021, 1:49pm

Please use triple-backtick fences to make your code legible.

Frisus95 · July 10, 2021, 1:51pm

ok!

Frisus95 · July 10, 2021, 2:05pm

How can I save the result of the computation by using the Tullio package?
That is, if I use the latter in the following way:

@btime @tullio result := A[i,1,1,1,1]*A[l,1,1,1,1]*A[k,1,1,1,1]*A[j,1,1,1,1]*A[m,1,1,1,1]*B[m,l,k,j,i]

then the variable “result” is not defined…

nilshg · July 10, 2021, 4:28pm

I don’t use Tullio but in general @btime x = ... will leave x undefined, so you’ll probably want to remove that.

DNF · July 10, 2021, 5:44pm

Yeah, you definitely should remove @btime. It takes much longer, because it runs the code many times, and after that it throws away the result.

Probably not what you want.

@btime is just for benchmarking, not to be included in your actual code.

Topic		Replies	Views
Tensor Contractions for N-dimensional tensors Performance package , tullio , tensors	5	1534	June 22, 2020
Using TensorOperations.jl when contracting tensors not following strict Einstein summation convention Specific Domains question , tensors , tensoroperations	8	324	April 17, 2024
How do I implement the fastest tensor contractions Performance question , tensors , tensor-contraction , tensoroperations	11	864	April 18, 2024
Improving performance of tensor contractions Performance question , blas , tensors	7	1649	June 14, 2019
Tensor multiplication Performance arrays , tensors	9	6664	March 20, 2019

Fastest way of contracting arrays

function funza() acc = 0.0 for i = 1:1:d, j= 1:1:d, k= 1:1:d, l= 1:1:d, m = 1:1:d acc += A[i,1,1,1,1]*A[l,1,1,1,1]*A[k,1,1,1,1]*A[j,1,1,1,1]*A[m,1,1,1,1]*B[m,l,k,j,i] end return acc end

@time funza() 0.209131 seconds (2.12 M allocations: 61.713 MiB, 1.55% gc time)

Related topics

function funza()
acc = 0.0
for i = 1:1:d, j= 1:1:d, k= 1:1:d, l= 1:1:d, m = 1:1:d
acc += A[i,1,1,1,1]A[l,1,1,1,1]A[k,1,1,1,1]A[j,1,1,1,1]A[m,1,1,1,1]*B[m,l,k,j,i]
end
return acc
end

@time funza()
0.209131 seconds (2.12 M allocations: 61.713 MiB, 1.55% gc time)