How to improve performance in a function that repeatedly defines and multiplies matrices

DNF · January 4, 2024, 9:23pm

To nitpick, ρ^2 * Wapp^-1 is not the same as Wapp \ ρ^2; it should be translated to ρ^2 / Wapp. Inside the determinant it doesn’t matter much, though it leads to some floating-point differences.
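To make the left/right division point concrete, here is a minimal sketch. The 2x2 values and the name `rho_sq` are made up for illustration; they stand in for ρ^2 and Wapp from the original code, which I am assuming are square matrices.

```julia
using LinearAlgebra

# Made-up 2x2 stand-ins; the real ρ^2 and Wapp come from the original code.
rho_sq = [1.0 2.0; 3.0 4.0]
Wapp   = [2.0 1.0; 0.0 1.0]

right_div = rho_sq / Wapp   # solves X * Wapp = rho_sq, i.e. rho_sq * inv(Wapp)
left_div  = Wapp \ rho_sq   # solves Wapp * X = rho_sq, i.e. inv(Wapp) * rho_sq

println(right_div ≈ rho_sq * inv(Wapp))   # right division vs. the written formula
println(right_div ≈ left_div)             # the two quotients compared directly
println(det(right_div) ≈ det(left_div))   # their determinants compared
```

The two quotients are different matrices unless ρ^2 and Wapp happen to commute, but both determinants equal det(ρ^2) / det(Wapp) up to rounding, which is why the mix-up is harmless inside `det`.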

For performance (in addition to my doubts about the parallelization), there are many things that can be sped up; the most glaring one is perhaps

Uranium238:

for m in range(1,sysize)
    for mp in range(1,sysize)
        @. X11 = 0.0
        @. X12 = 0.0
        @. X21 = 0.0
        @. X22 = 0.0
        X11[m,:]=(J'*Si*J)[mp,:]*(-Bf[m,m])
        X12[m,:]=(J'*Bi*J)[mp,:]*(Bf[m,m])
        X21[m,:]=(J'*Si*J)[mp,:]*(Sf[m,m])
        X22[m,:]=(J'*Bi*J)[mp,:]*(-Sf[m,m])
        X = [X11 X12
            X21 X22]
        @. Y1 = 0.0
        @. Y2 = 0.0
        Y1[m,:] = Bf[m,m]*(J'*U*D)[mp,:]
        Y2[m,:] = -Sf[m,m]*(J'*U*D)[mp,:]
        Y = [Y1
            Y2]
        g = im*tr(W\X)+    (transpose(W\V)*X*(W\V))[1]      -    (transpose(Y)*(W\V))[1]
        s = s+g[1]*gfcpart*T[m,mp]
    end
end

where the products J' * Si * J etc. are performed over and over, instead of once outside the loop. Slicing along rows is also slower than along columns (Julia arrays are stored column-major), the slices allocate unnecessary temporary arrays, and the entire matrices X11, etc. are zeroed out in each iteration, even though only one row has been touched.
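A rough sketch of what those fixes could look like, restricted to the `tr(W\X)` term to keep it short. The function names are hypothetical, and I am assuming real-valued square inputs and that `W` does not change inside the loop (factoring it once with `lu` is an extra step on top of the points above). The first function is a stripped-down copy of the quoted pattern, kept only as a reference to compare against.

```julia
using LinearAlgebra

# Stripped-down version of the quoted loop (trace term only), kept as a reference.
function trace_sum_naive(J, Si, Bi, Bf, Sf, W, T)
    n = size(J, 1)
    X11 = zeros(n, n); X12 = zeros(n, n); X21 = zeros(n, n); X22 = zeros(n, n)
    s = 0.0im
    for m in 1:n, mp in 1:n
        X11 .= 0; X12 .= 0; X21 .= 0; X22 .= 0
        X11[m, :] = (J' * Si * J)[mp, :] * (-Bf[m, m])
        X12[m, :] = (J' * Bi * J)[mp, :] * (Bf[m, m])
        X21[m, :] = (J' * Si * J)[mp, :] * (Sf[m, m])
        X22[m, :] = (J' * Bi * J)[mp, :] * (-Sf[m, m])
        X = [X11 X12; X21 X22]
        s += im * tr(W \ X) * T[m, mp]
    end
    return s
end

# Same result, with the loop-invariant work hoisted out of the loop.
function trace_sum_hoisted(J, Si, Bi, Bf, Sf, W, T)
    n = size(J, 1)
    # Products computed once and stored transposed, so that the row
    # (J'*Si*J)[mp, :] becomes the contiguous column JSJt[:, mp].
    JSJt = permutedims(J' * Si * J)
    JBJt = permutedims(J' * Bi * J)
    Wlu = lu(W)              # assumption: W is constant, so factor it once
    X = zeros(2n, 2n)        # one preallocated block matrix instead of four plus hcat/vcat
    s = 0.0im
    for m in 1:n
        bf, sf = Bf[m, m], Sf[m, m]
        for mp in 1:n
            sj = view(JSJt, :, mp)   # views do not allocate copies
            bj = view(JBJt, :, mp)
            X[m, 1:n]      .= (-bf) .* sj
            X[m, n+1:2n]   .= bf .* bj
            X[n+m, 1:n]    .= sf .* sj
            X[n+m, n+1:2n] .= (-sf) .* bj
            s += im * tr(Wlu \ X) * T[m, mp]
        end
        # Only rows m and n+m were written, so only those need resetting.
        X[m, :] .= 0
        X[n+m, :] .= 0
    end
    return s
end

# Small random check that both versions agree.
n = 4
J, Si, Bi, Bf, Sf, T = (randn(n, n) for _ in 1:6)
W = randn(2n, 2n) + 2n * I
println(trace_sum_naive(J, Si, Bi, Bf, Sf, W, T) ≈ trace_sum_hoisted(J, Si, Bi, Bf, Sf, W, T))
```

The solve `Wlu \ X` still allocates a full result in every iteration, so this is only a first step; since `X` has just two nonzero rows, the trace could probably be computed much more cheaply, but that goes beyond the points listed above.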

So there’s good news: it should be possible to speed up at least GHT by a lot.
