Hi, I am trying to improve the performance of my Julia code by using all the cores on my computer. For this, I am using the FLoops package. My code is simple: I generate 100 instances of a random density matrix of size 1000 x 1000 and calculate the mean entanglement entropy. Unfortunately, I cannot get the desired speedup for the code below:
You likely won’t get significant speedups from parallelism, since the matrix multiplication and matrix logarithm are already multithreaded (by BLAS). That said, I think the algorithm can likely be sped up significantly.
For one thing, C*C' can be sampled directly (it follows a Wishart distribution). I’m not actually convinced that there isn’t a relatively simple scalar distribution you can sample directly that would give the same result.
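One possible algorithmic speedup of the kind hinted at above, as a minimal sketch: assuming the quantity being computed is the von Neumann entropy S = -tr(rho log rho), it can be obtained from the eigenvalues of rho alone, which avoids forming the full matrix logarithm and the extra matrix product. The helper names `random_density_matrix`, `entropy_logm`, and `entropy_eig` are hypothetical, and the construction rho = C*C' / tr(C*C') with a complex Gaussian C is an assumption about the original code, which is not shown here.

```julia
using LinearAlgebra, Random

# Hypothetical helper (assumed construction): rho = C*C' / tr(C*C')
# for a complex Gaussian matrix C.
function random_density_matrix(rng::AbstractRNG, n::Integer)
    C = randn(rng, ComplexF64, n, n)
    W = C * C'                  # Wishart-type matrix, positive semidefinite
    return W / real(tr(W))      # normalize to unit trace
end

# Entropy via the matrix logarithm: S = -tr(rho * log(rho)).
entropy_logm(rho) = -real(tr(rho * log(Hermitian(rho))))

# Same quantity from the eigenvalues p_i of rho: S = -sum(p_i * log(p_i)).
# Non-positive eigenvalues (round-off) contribute zero.
function entropy_eig(rho)
    p = eigvals(Hermitian(rho))
    return -sum(x -> x > 0 ? x * log(x) : zero(x), p)
end

rng = MersenneTwister(42)
rho = random_density_matrix(rng, 50)
S_log = entropy_logm(rho)
S_eig = entropy_eig(rho)
```

Both functions should agree up to round-off. Whether the eigenvalue route is noticeably faster for 1000 x 1000 matrices is worth timing on your own machine rather than taking for granted.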
Well, as far as I know, this is the procedure to generate a Wishart ensemble: we have to make sure that the matrix is positive semidefinite, and any positive semidefinite matrix can be written in the form C*C'.
I would also guess that even Distributions.Wishart computes something like X*X' internally. What I was expecting from the parallelization is this: I need to generate 100 instances of the matrix, and I have 10 threads on my computer. Is it not possible to divide the work among the threads so that each one handles 10 matrices and I get a 10x speedup? I am new to this, so this might sound naive.
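The "one batch of matrices per thread" idea can be sketched as follows. This is a rough illustration under stated assumptions, not the original code: it uses `Threads.@threads` from Base instead of FLoops, assumes Julia 1.7 or later (for `Xoshiro`), and uses hypothetical helpers `sample_entropy` and `mean_entropy_threaded`. The important detail is that BLAS is limited to a single thread while the outer loop runs, so the Julia threads and the BLAS threads do not compete for the same cores.

```julia
using LinearAlgebra, Random, Statistics

# Hypothetical helper: entropy of one random n x n density matrix,
# assuming rho = C*C' / tr(C*C') with a complex Gaussian C.
function sample_entropy(rng::AbstractRNG, n::Integer)
    C = randn(rng, ComplexF64, n, n)
    W = C * C'
    p = eigvals(Hermitian(W)) ./ real(tr(W))   # eigenvalues of rho
    return -sum(x -> x > 0 ? x * log(x) : zero(x), p)
end

function mean_entropy_threaded(n::Integer, samples::Integer; seed::Integer = 1)
    old = BLAS.get_num_threads()
    BLAS.set_num_threads(1)            # avoid oversubscribing the cores
    results = Vector{Float64}(undef, samples)
    try
        Threads.@threads for i in 1:samples
            rng = Xoshiro(seed + i)    # independent, reproducible RNG per sample
            results[i] = sample_entropy(rng, n)
        end
    finally
        BLAS.set_num_threads(old)      # restore the previous BLAS setting
    end
    return mean(results)
end

S = mean_entropy_threaded(100, 20)
```

Julia has to be started with several threads for this to help, for example `julia -t 10`. A full 10x speedup is not guaranteed even then: the multithreaded BLAS baseline already uses several cores, and large matrices can be limited by memory bandwidth, so it is worth timing both variants.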