Parallel TensorKit

Based on Please read: make it easier to help you - #2

Avoid extending the original question that has already been answered with many new questions. It is better to start a new thread.

So let me start a new thread here, rather than extending TensorKit efficiency.

For the following code:


using TensorKit
using BenchmarkTools 

na = nb = nc = nd = 12
A = rand(na, nb)
B = rand(nb, nc, nd)
C = zeros(na, nc, nd)


function tensorkit_contraction(A, B, C)
    @tensor C[a,c,d] := A[a,b] * B[b,c,d]
end

@btime tensorkit_contraction(A,B,C)

saved as 1.jl. I tried to run julia 1.jl -t 1 and julia 1.jl -t 2 (Julia 1.8.5).
I got
3.900 μs (1 allocation: 13.62 KiB) and 3.837 μs (1 allocation: 13.62 KiB)
These look similar. Is there any simple way to parallelize TensorKit?

I don’t know about TensorKit.jl, but Tullio.jl typically has good parallel performance. Maybe you want to try it out.


I think your arguments here should be swapped around: it should be julia -t 1 1.jl instead. The flags must come before the script name, otherwise they are passed to the script as ARGS. You can check by printing out Threads.nthreads().
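As a quick sanity check (a minimal sketch; the file name check.jl is just an example), you can print the thread count and see which invocation actually sets it:

```julia
# check.jl -- print how many Julia threads this process was started with
println("Julia threads: ", Threads.nthreads())
```

Running julia -t 2 check.jl should report 2 threads, while julia check.jl -t 2 leaves -t 2 in ARGS and reports the default of 1.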

Also, since these are linear algebra operations, it will likely use BLAS threads for a lot of the parallelism, which are separate from Julia threads. This can be set with

using LinearAlgebra;
BLAS.set_num_threads(1)

for example.

It’s hard to know which type of parallelism a library will use without looking at the docs/source. But you can experiment to find out by setting Julia threads and BLAS threads separately.

Thanks. I tried

using TensorKit
using Tullio
using BenchmarkTools

na = nb = nc = nd = 12
A = rand(na, nb)
B = rand(nb, nc, nd)
C = zeros(na, nc, nd)



function tensorkit_contraction(A, B, C)
    @tullio C[a,c,d] := A[a,b] * B[b,c,d]
end

@btime tensorkit_contraction(A,B,C)

but did not get a speedup with julia -t 1 1.jl or julia -t 2 1.jl

Your problem size seems too small to gain anything from multithreading. Try for larger na, nb, nc, nd.
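For instance (a sketch of the same contraction at 10× the size; exact timings depend on the machine), Tullio's thread-based parallelism only kicks in once the work per call is large enough:

```julia
using Tullio
using BenchmarkTools

na = nb = nc = nd = 120   # 10x the original 12
A = rand(na, nb)
B = rand(nb, nc, nd)

# Same contraction as before: C[a,c,d] = sum_b A[a,b] * B[b,c,d]
contract(A, B) = @tullio C[a,c,d] := A[a,b] * B[b,c,d]

# Compare `julia -t 1 1.jl` against `julia -t 2 1.jl`;
# interpolating with $ avoids benchmarking global-variable access.
@btime contract($A, $B);
```

At the original size of 12 the whole contraction finishes in microseconds, so the overhead of spawning tasks swamps any gain from a second thread.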


Thanks! With na = nb = nc = nd = 120:
julia -t 1 1.jl: 171 ms
julia -t 2 1.jl: 85 ms