I am a Julia novice and converted the following program from Python (as a simple practice exercise). It computes an N by N random matrix of quaternions, drawn from the Gaussian distribution, then computes the smallest singular value, and repeats this many times. (I have omitted the plotting of the resulting distribution for simplicity.) We use the fact that quaternions may be represented as 2 by 2 complex matrices.
I have a few basic questions, which elude me even after looking through the manual.
The version below hard-codes the constants 100 and 1000. When these are replaced by n and samples inside the data function, the code becomes 0.5 seconds slower (4.8 versus 5.3 seconds). Why is this, and is there a way to prevent this slowdown without hard-coding the constants, so that the values are easier to change?
This is a literal translation of a Python script. Are there any easy ways to make it faster or more memory efficient? I feel I’m surely missing a few tricks. In particular, it produces 2.5 GiB of memory allocations, which seems sub-optimal.
Can the compilation of the script be cached, so that I do not re-pay the compilation cost each time the script is run without being edited?
Are there any Julia packages/modules whose source code is particularly well-written and instructive, which I could read to pick up idiomatic, high-performance Julia? I was thinking of the linear algebra library, but then I realized that it's probably just calls to OpenBLAS, which is not written in Julia, so that would likely not help. Perhaps the ODE/PDE/SDE solving library?
I apologize for any misunderstandings in these questions – I am still an inexperienced programmer, and an even more inexperienced Julia user. Thank you!
using LinearAlgebra

n = 100
samples = 1000

function symplectic(N)
    G = Array{Complex{Float64}}(undef, 2N, 2N)
    # Each quaternion a + b*i + c*j + d*k is stored as the 2x2 complex block
    # [a+b*im  c+d*im; -c+d*im  a-b*im].
    for i in 0:N-1
        for j in 0:N-1
            a = randn()
            b = randn()
            c = randn()
            d = randn()
            @inbounds G[2*i + 1, 2*j + 1] = a + b*im
            @inbounds G[2*i+1 + 1, 2*j+1 + 1] = a - b*im
            @inbounds G[2*i+1 + 1, 2*j + 1] = -c + d*im
            @inbounds G[2*i + 1, 2*j+1 + 1] = c + d*im
        end
    end
    M = G * G'
    ev = eigvals(M)    # M is Hermitian, so the eigenvalues come back in ascending order
    return ev[1]*N     # smallest eigenvalue, rescaled
end

function data()
    for i in 1:1000
        println(i, " ", symplectic(100))
    end
end

@time data()
I wouldn't recommend my own code here, since the ODE/PDE/SDE solving libraries are almost akin to OpenBLAS at times, with their own definitions of broadcast variants, exponentiation, etc. There are some things to learn from them about how to write high-performance generic code, but there is also a ton of extra machinery that I wouldn't recommend people imitate unless their goal is simply to make users as happy as possible.
Instead of generating single scalars by calling randn() many times inside the nested loops, you could create 100x100 matrices with randn(100, 100) before the loops.
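For instance, a minimal sketch of that idea (the names A, B, C, D are illustrative, not from the original code):

# draw all 4*N^2 Gaussians up front, in four bulk calls
A, B, C, D = randn(N, N), randn(N, N), randn(N, N), randn(N, N)
# then, inside the loops, read A[i+1, j+1], B[i+1, j+1], etc.
# in place of the four scalar randn() calls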
First, ideally you can pass n and samples as parameters to data():
data(n, samples)
Avoid using global variables (or, at least, declare them as const n = 100, etc.).
You could allocate G in data and pass it as a parameter to symplectic. You could also allocate the other auxiliary arrays (M, for instance) and pass them as parameters as well.
Something like this (needs checking, and it might still be possible to get the eigenvalues into a preallocated array, such that the function does not allocate anything):
using LinearAlgebra

function symplectic(N, G, R)
    for i in 0:N-1
        for j in 0:N-1
            @. R = randn()    # refill the length-4 buffer with four independent draws
            @inbounds G[2*i + 1, 2*j + 1] = R[1] + R[2]*im
            @inbounds G[2*i+1 + 1, 2*j+1 + 1] = R[1] - R[2]*im
            @inbounds G[2*i+1 + 1, 2*j + 1] = -R[3] + R[4]*im
            @inbounds G[2*i + 1, 2*j+1 + 1] = R[3] + R[4]*im
        end
    end
    ev = svdvals!(G)    # in-place: destroys G, but G is fully rewritten on the next call
    return ev[2*N]*N    # singular values are descending, so ev[2N] is the smallest
end

function data(n, samples)
    G = Array{Complex{Float64}}(undef, 2n, 2n)
    R = zeros(4)
    for i in 1:samples
        symplectic(n, G, R)
    end
end

n = 100
samples = 1000
@time data(n, samples)
(Of course, you can also loop from 1 to N and remove the +1s from the G[...] indexing.)
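For concreteness, that 1-based variant of the loop body would look something like this (untested sketch, same behavior):

for i in 1:N, j in 1:N
    @. R = randn()
    G[2i-1, 2j-1] = R[1] + R[2]*im
    G[2i,   2j  ] = R[1] - R[2]*im
    G[2i,   2j-1] = -R[3] + R[4]*im
    G[2i-1, 2j  ] = R[3] + R[4]*im
end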
Thank you everyone for the suggestions! They are helpful, and I am working through them carefully.
Here are my preliminary observations.
Using const indeed removes the penalty from using the global variables (compared to hard-coding), which resolves one of my questions.
Strangely, using svdvals(G), squaring, and taking the element at index end, instead of computing the eigenvalues of G * G' as above, almost doubles the running time of the program. I'm not sure what's going on here, but this can be observed both when I run @lmiq's version and when I edit mine to include the SVD.
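For reference, the two routes should agree up to floating-point error; a quick sanity check (the test size is arbitrary):

using LinearAlgebra
G = randn(ComplexF64, 10, 10)
eigvals(Hermitian(G * G'))[1]    # smallest eigenvalue of G*G' (ascending order)
svdvals(G)[end]^2                # smallest singular value of G, squared
# these two numbers should match up to floating-point error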
@inbounds does nothing in this script, and should likely be removed from it for clarity.
Initializing G in data() makes the script about 0.1 seconds slower, but it uses about 2 GiB of memory instead of 2.5 GiB. I don't see much change from initializing the random numbers outside of the symplectic() function.
Keep in mind that for more proper benchmarking you should use BenchmarkTools. Or, at least, run the code twice: the first time to compile, the second to measure the time and allocations.
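For example (a minimal sketch; @btime and the $-interpolation of global variables come from BenchmarkTools):

using BenchmarkTools
@btime data($n, $samples)    # runs the call multiple times and reports the minimum time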
Thank you for pointing this out. Doing this, I get virtually identical results in time to my original script (4.8 seconds) but with less memory usage (1.9 vs 2.5 GiB), so your suggestion is indeed a win. Sorry for the mistake.
I still wonder if there is a way to cache the compilation between successive runs of the same script…
If two executions occur in the same Julia session, functions that did not change (and did not receive different types of parameters) won't be compiled again.
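For instance, at the REPL (script.jl is just a placeholder name):

julia> include("script.jl")   # first run pays the compilation cost
julia> include("script.jl")   # second run in the same session reuses the compiled code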
17.909110 seconds (616.27 k allocations: 256.346 MiB, 1.37% gc time)
This is the code:
using LinearAlgebra

function symplectic(N, G, R)
    for i in 0:N-1
        for j in 0:N-1
            @. R = randn()
            @inbounds G[2*i + 1, 2*j + 1] = R[1] + R[2]*im
            @inbounds G[2*i+1 + 1, 2*j+1 + 1] = R[1] - R[2]*im
            @inbounds G[2*i+1 + 1, 2*j + 1] = -R[3] + R[4]*im
            @inbounds G[2*i + 1, 2*j+1 + 1] = R[3] + R[4]*im
        end
    end
    ev = svdvals!(G)
    return ev[2*N]*N
end

function data(n, samples)
    G = Array{Complex{Float64}}(undef, 2n, 2n)
    R = zeros(4)
    for i in 1:samples
        println(symplectic(n, G, R))
    end
end

n = 100
samples = 1000
data(n, samples)          # first call compiles
@time data(n, samples)    # second call measures run time and allocations only
And the difference (relative to the 2 GiB version) is using svdvals! (note the exclamation mark).
Edit: I am not sure this is doing exactly the same as your code. But appropriate use of adjoint! and mul! could reduce the allocations even further.
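Something like this, for instance (a rough, untested sketch, assuming M is a preallocated 2N x 2N complex matrix passed in alongside G; with the lazy adjoint G', adjoint! turns out not to be needed here):

# inside symplectic, replacing the svdvals! lines:
mul!(M, G, G')                 # in-place product, no new matrix allocated
ev = eigvals!(Hermitian(M))    # eigenvalues computed reusing M's storage
return ev[1]*N                 # smallest eigenvalue, as in the original version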
Interesting, thank you. I reproduce your drastic decrease in memory usage, to about 220 MiB. However, the time it takes to run the script about doubles when using svdvals!.
When I compute the singular values "by hand" by first forming G * G' and then calling eigvals! (note the exclamation mark), it runs in about half the time of the svdvals! version but uses about 5 times as much memory. This is still, very roughly, about half the memory usage of the original, non-exclamation-mark version.
I am puzzled why the dedicated SVD routine is twice as slow as the direct computation I'm using.
(Normally you would take the SVD of G, not of M, in which case the singular values are the square roots of the eigenvalues of M. Also, you might then have to omit some zero eigenvalues for non-square or rank-deficient G.)
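That relation is easy to check directly (a small sanity-check sketch, with an arbitrary test size):

using LinearAlgebra
G = randn(ComplexF64, 6, 6)
sv = svdvals(G)                                  # descending order
sq = sqrt.(reverse(eigvals(Hermitian(G * G'))))  # ascending eigenvalues, reversed and rooted
sv ≈ sq                                          # true up to floating-point error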