Parallel processing using FLoops

Hi, I am trying to enhance the performance of my Julia code by using all the cores on my computer. For this, I am using the FLoops package. My code is simple: I generate 100 instances of a random density matrix of size 1000 x 1000 and calculate the mean entanglement entropy. Unfortunately, I cannot get the desired speedup with the code below:
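(The original post’s code isn’t reproduced above; the following is only a minimal sketch of the kind of loop being described. The function name mean_entropy and the exact reduction are assumptions.)

using FLoops, Distributions, QuantumInformation, LinearAlgebra

# Sketch: average the von Neumann entropy of 100 random density matrices.
function mean_entropy(n = 1000, samples = 100)
    @floop for _ in 1:samples
        C = rand(Normal(), n, n)    # n×n matrix of iid standard normals
        ρ = C * C'                  # positive semidefinite
        ρ ./= tr(ρ)                 # normalize so tr(ρ) == 1
        @reduce e += vonneumann_entropy(ρ) / log(2)
    end
    return e / samples
end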

I am using the Distributions package for generating the random matrices and the QuantumInformation package for calculating the entropy. Is there any way to speed it up, or have I got it wrong?

You likely won’t get significant speedups from parallelism, since the matrix multiplication and matrix logarithm are already multithreaded (by BLAS). That said, I think the algorithm itself can likely be sped up significantly.
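One quick way to see the first point is to compare the BLAS thread count with the Julia thread count (both are standard calls from LinearAlgebra and Base.Threads):

using LinearAlgebra

@show BLAS.get_num_threads()   # threads the BLAS library already uses for C * C'
@show Threads.nthreads()       # threads available to Julia-level loops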

Can you elaborate on this, please?

Is there not a way to do this without materializing a huge matrix, and then another one, and a third one?

I am not really sure what you mean, but since what we have is a random matrix, we need to take several instances of it and then take the average.

For one thing, C*C' can be sampled directly (it’s a Wishart distribution). I’m also not convinced that there isn’t a relatively simple scalar distribution you could sample directly that would give the same result.

Well, as far as I know, this is the procedure to generate a Wishart ensemble: we have to make sure that the matrix is positive semidefinite, and any positive semidefinite matrix has the form C*C'.

https://juliastats.org/Distributions.jl/stable/matrix/#Distributions.Wishart
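For illustration, a short sketch of sampling that distribution directly with Distributions.jl (identity scale matrix here; whether this is actually faster than forming C*C' yourself is not guaranteed):

using Distributions, LinearAlgebra

n = 1000
W = rand(Wishart(n, Matrix(1.0I, n, n)))   # same distribution as C*C' for C with iid N(0,1) entries
ρ = W / tr(W)                              # normalize to a density matrix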

But I still think even Distributions.Wishart would implement X*X' internally. Anyway, I was expecting to parallelize the code like this: I need to generate 100 instances of the matrix, and I have 10 threads on my computer. Is it not possible for the work to be divided among all the threads in parallel, so that each thread handles 10 matrices and I get a 10x speedup? I am new to this, so this might sound naive.

you might be correct.

No, as Oscar said, C * C' is already multi-threaded because BLAS is multi-threaded, so dividing the work like this shouldn’t give you a linear speedup.
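If you still want to experiment with threading the outer loop, the usual pattern (a sketch, not a guaranteed win) is to first hand the parallelism over to the outer loop by pinning BLAS to a single thread:

using LinearAlgebra

BLAS.set_num_threads(1)   # let the per-sample loop, not BLAS, use the threads

With that setting, a threaded loop over the 100 samples spreads them across Julia’s threads, but each C * C' and each entropy call then runs single-threaded, so the total amount of work stays roughly the same.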

Thanks @Oscar_Smith and @jling . I get it now.

julia> function g()
           e = 0.0
           C = Matrix{Float64}(undef, 1000, 1000)   # preallocated buffer,
           ρ = similar(C)                           # reused on every iteration
           for _ = 1:100
               rand!(Normal(), C)                   # fill C in place with N(0,1) samples
               LinearAlgebra.mul!(ρ, C, C')         # ρ = C*C' without allocating
               ρ ./= tr(ρ)                          # normalize to a density matrix
               e += vonneumann_entropy(ρ)/log(2)
           end
           e
       end

It doesn’t seem to make it faster, but at least it reduces memory allocation by 70%.


Why do I get a rand! not defined error?

using Distributions, QuantumInformation, Random, LinearAlgebra

OK, I see, thanks again! Also, can you please suggest an article that explains the usage of ! in function names in Julia?

https://docs.julialang.org/en/v1/manual/variables/#Stylistic-Conventions

It’s nothing special: a function name ending with ! is just a hint that the function modifies one of its arguments. It’s only a naming convention for functions.
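A tiny illustration of the convention, using sort and sort! from Base:

julia> v = [3, 1, 2];

julia> sort(v)     # returns a new sorted vector; v itself is unchanged
3-element Vector{Int64}:
 1
 2
 3

julia> sort!(v);   # the ! version sorts v in place

julia> v
3-element Vector{Int64}:
 1
 2
 3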

Oh, alright! Thank you!