# How to speed up this Kronecker Multiplication?

I am writing a code that gives me tensor product of the required Pauli matrices, and I am doing this using kron function.

using StaticArrays

#%% We are definig the Pauli matrices here.

σˣ=SA[0 1.0; 1 0] #\sigma^x

σᶻ=SA[1.0 0; 0 -1] #\sigma^y

σʸ=-im*σᶻ*σˣ #\sigma^z

#println(σʸ,"\n",σᶻ,"\n",σˣ)

σvec=[σˣ,σʸ,σᶻ]


the above code stores the Pauli matrices

function σTensor(A,n::Int)
include("Pauli_matrices.jl")
#= Inputs: number of two level systems n (= number of tensor products + 1)
size of the input array is N and is of the form [1,1,3,2] for N=4
1 represents x, 2 represents y and 3 represents z
=#
ret=σvec[A[1]]
for i in 2:n
ret=kron(ret,σvec[A[i]])
end
return ret
end


the above code does the required product.

just for tensor product of 3 matrices it takes 60kib’s
is there a way to improve this or is it the max we can do?

since the output matrix is sparse is there a way to include that into the code?

As usual, the question with Kronecker products is: do you really need to build the matrix explicitly? Or can you consider it as a lazy operator in your code?

1 Like

Don’t structure your code like this, calling include from within the body of a function. It’s ugly.

This happens within the σTensor function, because of the include

Provide a complete reproducible example if you want help.

1 Like

here A=[1,2,3]
I have changed the code in the following way as you have suggested

include("PauliMatrices.jl")
function σTensor(A::Array,n::Int)
#= Inputs: number of two level systems n (= number of tensor products + 1)
size of the input array is N and is of the form [1,1,3,2] for N=4
1 represents x, 2 represents y and 3 represents z
=#
ret=σvec[A[1]]
for i in 2:n
ret=kron(ret,σvec[A[i]])
end
return ret
end


and I am getting the following results for different A

1. is there a way to reduce the number of allocations to 500 or less
2. is there a way to implement sparse matrices here?
(I am new to Julia and also programming in general, so please be noob friendly in your replies)

The usual approach is to just use SparseArrays.jl. I don’t think that any further compiler optimization (i.e. StaticArrays.jl, type-level information) is going to help if you consider > 7 sites.

julia> using SparseArrays, LinearAlgebra, BenchmarkTools
julia> σx = sparse([0 1; 1.0 0])
2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
⋅   1.0
1.0   ⋅

julia> σy = sparse([0 -im; im 0.0])
2×2 SparseMatrixCSC{ComplexF64, Int64} with 2 stored entries:
⋅      0.0-1.0im
0.0+1.0im      ⋅

julia> σz = sparse([1 0; 0 -1.0])
2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
1.0    ⋅
⋅   -1.0

julia> const σvec = [σx, σy, σz] # note the const! It is needed for performance
3-element Vector{SparseMatrixCSC{Tv, Int64} where Tv}:
sparse([2, 1], [1, 2], [1.0, 1.0], 2, 2)
sparse([2, 1], [1, 2], ComplexF64[0.0 + 1.0im, 0.0 - 1.0im], 2, 2)
sparse([1, 2], [1, 2], [1.0, -1.0], 2, 2)

julia> function σTensor(A)
n = length(A) # why did you pass it as parameter?
ret=σvec[A[1]]
for i in 2:n
ret=kron(ret,σvec[A[i]])
end
return ret
end
julia> @btime σTensor([1,2,3], 3)
539.524 ns (13 allocations: 1.08 KiB)
8×8 SparseMatrixCSC{ComplexF64, Int64} with 8 stored entries:
⋅           ⋅          ⋅      …       ⋅      0.0-1.0im       ⋅
⋅           ⋅          ⋅              ⋅          ⋅      -0.0+1.0im
⋅           ⋅          ⋅              ⋅          ⋅           ⋅
⋅           ⋅          ⋅         -0.0-1.0im      ⋅           ⋅
⋅           ⋅      0.0-1.0im          ⋅          ⋅           ⋅
⋅           ⋅          ⋅      …       ⋅          ⋅           ⋅
0.0+1.0im       ⋅          ⋅              ⋅          ⋅           ⋅
⋅      -0.0-1.0im      ⋅              ⋅          ⋅           ⋅


2 Likes

The thing is I don’t know and I am new to this lazy matrices. How will I determine this?
I was using the Static Arrays because I read that they are quicker than normal Array. My end goal is to implement time evolution of a quantum system, as efficiently as possible.
the problem with quantum systems if there are n qubits in the system, the operators in the system are 2^nx2^n complex matrix.

If you want to do time evolution of quantum systems, then a lazy approach will not be helpful. In the simplest case you want to construct your Hamiltonian and then use a eigen decomposition to compute the matrix exponentials efficiently. This will work for up to ~12-14 spins reasonably fast and easy. Beyond that you will need to employ more and more sophisticated tricks

Shameless plug: If you want to study simple spin models, you might want to have a look at my library SpinModels.jl with which you can construct Hamiltonians with single and two-body terms rather efficiently (It is not in the registry for now, so you need to add it using Pkg.add(;url="https://github.com/abraemer/SpinModels.jl")).

Thank you for your answer! . I have implemented your code and it works great it brought down the allocations from 1000 to 10. I definitely will check out your SpinModels.jl. Thanks again.

1 Like