Creating a matrix as Symmetric causes excessive allocations in multithreaded loop

In the following code snippet, I use a symmetric matrix to calculate another one. If I wrap the matrix in Symmetric, the loop allocates excessively when it is multithreaded.

using BenchmarkTools
using LinearAlgebra  # provides Symmetric
function foo()
    ns = 1500
    ne = 200
    rbar = rand(Float64, (ne,ne)) * 30
    #rbar = Symmetric(rbar, :L)  # uncomment this line to see the difference
    potzl = rand(ComplexF64, (ne,ne))
    zls = [Array{ComplexF64}(undef, (ne,ne)) for t = 1:Threads.nthreads()]
    Threads.@threads for f = 1:ns
        t = Threads.threadid()
        zl = zls[t]
        g = rand(ComplexF64)
        for k = 1:ne
            for i = k:ne
                exp_gr = exp(-g * rbar[i,k]) # FIXME exp very slow
                zl[i,k] = exp_gr * potzl[i,k]
            end
        end
    end
end
precompile(foo, ())
@btime foo()

With the line rbar = Symmetric(rbar, :L) active, the output of @btime is 3.168 s (180900039 allocations: 4.95 GiB). If I comment that line out, the output of @btime is 287.963 ms (36 allocations: 3.67 MiB). Why does that happen?

Here, @code_warntype foo() shows the problem: rbar is first bound to a plain Matrix{Float64} and then rebound to a Symmetric wrapper, so you are using the same name for two objects of differing type. Sometimes the compiler is smart enough to handle that, but not always. If you give the first one a different name, the problem goes away.
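A minimal sketch of that renaming, under a few assumptions: the names foo_fixed, rbar_full and nchunks are made up for illustration, ns and ne are turned into arguments so the function is easy to try on small sizes, and the scratch buffers are indexed by a chunk number rather than by Threads.threadid(). The point is only that every variable used inside the threaded loop is assigned exactly once. Reassigning a variable that a closure captures is a known source of boxing in Julia, and the body of a Threads.@threads loop is such a closure, which is probably what happens in the original foo.

```julia
using LinearAlgebra  # provides Symmetric

function foo_fixed(ns = 1500, ne = 200)
    rbar_full = rand(Float64, (ne, ne)) * 30
    # New name for the wrapper: rbar is now assigned once and has one type.
    rbar = Symmetric(rbar_full, :L)
    potzl = rand(ComplexF64, (ne, ne))
    # One scratch buffer per chunk of work (illustrative layout, not the
    # original per-thread indexing).
    nchunks = Threads.nthreads()
    zls = [Array{ComplexF64}(undef, (ne, ne)) for _ in 1:nchunks]
    Threads.@threads for c in 1:nchunks
        zl = zls[c]
        # Chunk c handles iterations c, c + nchunks, c + 2nchunks, ...
        for f in c:nchunks:ns
            g = rand(ComplexF64)
            for k in 1:ne
                for i in k:ne
                    zl[i, k] = exp(-g * rbar[i, k]) * potzl[i, k]
                end
            end
        end
    end
    return zls
end
```

Running @btime foo_fixed() and @code_warntype foo_fixed() should show whether the allocations and the type instability are gone on your Julia version; no timings are claimed here.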


See also the performance tips: avoid changing the type of a variable.
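To illustrate what that tip is about, here is a small sketch in the spirit of the example in the performance tips. The function names are hypothetical. Both functions return the same value, but in the first one x starts as an Int and becomes a Float64 inside the loop, while in the second one x has the same type throughout.

```julia
# x is an Int before the loop and a Float64 after the first division,
# so its type changes within the function.
function unstable_halving(n)
    x = 1
    for _ in 1:n
        x /= 2
    end
    return x
end

# x is a Float64 from the start, so its type never changes.
function stable_halving(n)
    x = 1.0
    for _ in 1:n
        x /= 2
    end
    return x
end
```

In the REPL, @code_warntype unstable_halving(3) should flag x as a union of Int64 and Float64, whereas the stable version should show a single concrete type.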
