Can this contraction run faster without memory allocation?

Frisus95 · August 21, 2021, 1:51am

Hello everyone!

Is it possible to make this code run faster without allocating more memory than necessary?

julia> function funza(Tensor, indices)
           result = 0.0
           for i = 1:5
               T1 = Tensor[indices[1],indices[2],indices[3],indices[4],indices[5]]
               # if (T1 == 0) continue;  end
               for j = 1:5
                   T2 = Tensor[indices[6],indices[7],indices[8],indices[9],indices[10]]
                   # if (T2 == 0) continue;  end
                   for k = 1:5
                       T3 = Tensor[indices[11],indices[12],indices[13],indices[14],indices[15]]
                       # if (T3 == 0) continue;  end
                       for l = 1:5
                           T4 = Tensor[indices[16],indices[17],indices[18],indices[19],indices[20]]
                           # if (T4 == 0) continue;  end
                           for m = 1:5
                               T5 = Tensor[m,l,k,j,i]
                               # if (T5 == 0) continue;  end
                               result += T5*T4*T3*T2*T1
                           end # m
                       end # l
                   end # k
               end # j
           end # i
           # returning result (irrelevant for this topic)
       end
funza (generic function with 3 methods)

julia> function main(N_iterations)
           indices = rand(1:5,20)
           for n=1:N_iterations
               # indices are supposed to change randomly between iterations
               # (irrelevant for this topic)
               funza(Tensor, indices)
               # do something with the returning value of funza
               # (omitted since irrelevant for this topic)
           end
       end
main (generic function with 1 method)

julia> @time main(10)
  0.000018 seconds (1 allocation: 240 bytes)

julia> @time main(10^6)
  1.442957 seconds (1 allocation: 240 bytes)
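
The session above does not show how `Tensor` is defined, and `main` reads it as a global rather than receiving it as an argument. The following is only a minimal sketch of how such a timing could be set up, assuming a dense 5×5×5×5×5 `Float64` array (the size is guessed from the loop ranges `1:5`). The names `tensor`, `contract_all` and `run_iterations` are hypothetical and not part of the original code; `contract_all` is a simplified stand-in that sums every element instead of forming the five-factor product.

```julia
# Assumption: the tensor is a dense 5x5x5x5x5 Float64 array. The original
# post does not define it; the size is inferred from the loop ranges 1:5.
tensor = rand(5, 5, 5, 5, 5)

# Hypothetical stand-in for funza: the same five-deep loop nest, but it only
# sums the elements, so the structure of the benchmark stays visible.
function contract_all(T::AbstractArray{Float64,5})
    result = 0.0
    # The first iterator is the outermost loop and the last is the innermost,
    # so m (the first array dimension) varies fastest, matching column-major
    # storage.
    @inbounds for i in axes(T, 5), j in axes(T, 4), k in axes(T, 3),
                  l in axes(T, 2), m in axes(T, 1)
        result += T[m, l, k, j, i]
    end
    return result
end

# Hypothetical driver: the tensor is passed in as an argument, so the hot
# loop never reads a non-constant global.
function run_iterations(T, n_iterations)
    total = 0.0
    for _ in 1:n_iterations
        total += contract_all(T)
    end
    return total
end

run_iterations(tensor, 1)            # first call triggers compilation
@time run_iterations(tensor, 10)     # timed call excludes compilation
```

Passing the array as an argument lets the compiler specialize on its concrete type, which is the usual advice for code like this; whether it changes the numbers reported above depends on the parts of the code that were left out.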

As noted in the code comments, I have omitted some parts that are not relevant to this topic for the sake of clarity, so the code as shown is not meant to make sense on its own.

I have seen that there are many packages for this type of contraction (Tullio, Einsum, etc.), but it is essential that no additional memory is allocated when the indices are not hard-coded constants but variables (and not all of them are summed over!), as in the code above.
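
To make the requirement concrete, here is a small sketch of a contraction in which some indices are run-time values and only the remaining ones are summed, written as plain loops. It is an illustration under stated assumptions, not the omitted code: `contract_partial` and `run_partial` are hypothetical names, the tensor size is assumed to be 5 in every dimension, and `rand!` from the `Random` standard library is used to overwrite the index vector in place instead of creating a new one on each iteration.

```julia
using Random  # standard library; provides rand! for in-place refills

# Hypothetical example: idx holds three run-time indices (k, j, i) that are
# fixed for this call; only l and m are summed over.
function contract_partial(T::AbstractArray{Float64,5}, idx::AbstractVector{Int})
    k, j, i = idx[1], idx[2], idx[3]
    result = 0.0
    # Bounds checks are kept on purpose because k, j, i come from the caller.
    for l in axes(T, 2), m in axes(T, 1)
        result += T[m, l, k, j, i]
    end
    return result
end

# Hypothetical driver: the index vector is created once, before the loop,
# and refilled in place on every iteration.
function run_partial(T, n_iterations)
    idx = rand(1:5, 3)           # assumes every dimension has length 5
    total = 0.0
    for _ in 1:n_iterations
        rand!(idx, 1:5)          # overwrite the existing vector
        total += contract_partial(T, idx)
    end
    return total
end

tensor = rand(5, 5, 5, 5, 5)     # assumed size, not given in the post
run_partial(tensor, 1)           # first call triggers compilation
@time run_partial(tensor, 10^4)
```

The index vector is the only array created here, and it is created outside the loop, so the iteration count should not affect how many arrays are allocated. The exact figures printed by `@time` are not shown because they depend on the Julia version and machine.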

I haven’t found a package with these features so far, but I’m confident that there is a way to speed this up (maybe by using a GPU?).

Thanks a lot in advance everyone!
