How to speed up the function H in this code? (Hausdorff Chirality Measurement)

Luis_Manuel_Espinoza · September 13, 2023, 8:45pm

I am trying to implement the Hausdorff chirality measure from the article by Andrzej B. Buda and Kurt Mislow, but as I feared, it takes too long. To obtain a reliable value I need to take n greater than 1,000,000, and for structures of 35 atoms that takes almost 20 minutes. I have been thinking about how to speed it up but haven’t been able to make progress. Does anyone have an idea? I would try parallel computing, but I have no experience with it.

using LinearAlgebra  # provides norm and qr

function Compute_of_CM_R_CCM(coords)
    # Compute the center of mass of the atoms
    n = size(coords, 1)
    center_of_mass = sum(coords, dims=1) / n
    # Compute the distances between the center of mass and each atom
    distances = zeros(n)
    for i in 1:n
        distances[i] = norm((center_of_mass.-coords)[i,:])
    end
    # Compute the radius of the sphere, centered at the center of mass, that encloses all atoms
    radius = maximum(distances)
    # Compute the coordinates translated to the center of mass
    coordinates_origin = coords .- center_of_mass
    
    return center_of_mass, radius, coordinates_origin
end

function ρ(Q,Qp)
    n = size(Q,1)
    sup = []
    for i in 1:n
        inf = []
        for j in 1:n
            dis = norm(Q[i,:].-Qp[j,:])
            push!(inf,dis)
        end
        push!(sup,minimum(inf))
    end
    return maximum(sup)    
end

function dQ(coordenadas)
    n = size(coordenadas,1)
    D = []
    for i in 1:n 
        distan = []
        for j in i+1:n
            d = norm(coordenadas[i,:]-coordenadas[j,:])
            push!(D,d)
        end
    end
    return maximum(D)
end

function H(q,qp,n)
    A = Compute_of_CM_R_CCM(q)[3]
    B = Compute_of_CM_R_CCM(qp)[3]
    HH = []
    for i in 1:n
        BB = B*qr(randn(3, 3)).Q
        HCM = max(ρ(A,BB),ρ(BB,A))/dQ(A)
        push!(HH,HCM)
    end
    return minimum(HH)
end

Thanks for your time!

Sorry, I forgot to add an example of coordinates that are being used:

s = [ 1.73761 -2.3299 4.09897
   1.7145 -1.88395 2.6064
   0.810808 -3.53329 4.43137
   1.35042 -1.47855 4.72023
   2.18376 -2.67875 1.98346
   2.32905 -0.962011 2.49604
   3.08037 -3.63788 4.84995
   3.53686 -2.0441 5.09464
  -0.656357 -2.61487 3.56265
   3.11152 -2.69374 4.42536
   1.24236 -4.55928 4.94814
   2.0e-6 -1.53167 1.96559
  -0.51205 -3.38046 4.19649]

DNF · September 13, 2023, 9:20pm

There are many inefficiencies in your code, but I’ll point out two important ones:

Firstly,

Make sure never to initialize vectors with a bare [] (as in sup = [], inf = [], D = [] and HH = [] in your code). This creates a Vector{Any}, which means that the compiler has no way of knowing what kind of values it contains, and therefore cannot generate fast, specialized instructions.
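
To illustrate the difference, here is a minimal sketch (the variable names are made up for this example and do not come from the original code):

```julia
# A bare [] has element type Any; a typed literal has a concrete element type.
untyped = []
typed = Float64[]

push!(untyped, 1.5)
push!(typed, 1.5)

println(typeof(untyped))  # Vector{Any}
println(typeof(typed))    # Vector{Float64}
```

With Float64[] the compiler knows every element is a Float64, so calls such as minimum or maximum on the vector can be specialized.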

Secondly,

Luis_Manuel_Espinoza:

function dQ(coordenadas)
    n = size(coordenadas,1)
    D = []
    for i in 1:n 
        distan = []
        for j in i+1:n
            d = norm(coordenadas[i,:]-coordenadas[j,:])
            push!(D,d)
        end
    end
    return maximum(D)
end

Here (and other places, too) you painstakingly build a vector, but in the end you throw it away, just keeping the maximum. Instead just keep the running max:

function dQ(coordenadas)
    n = size(coordenadas,1)
    dmax = 0.0
    for i in 1:n 
        for j in i+1:n
            # faster if the points are stored as columns (3×n), since Julia arrays are column-major:
            # d = @views norm(coordenadas[:,i]-coordenadas[:,j])
            d = norm(coordenadas[i,:]-coordenadas[j,:])
            dmax = max(dmax, d)
        end
    end
    return dmax
end

Always make sure to not create arrays if you don’t need them.
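
The same running-value pattern should carry over to ρ. A minimal sketch, assuming the coordinates are stored one point per row in an n×3 matrix as in the original post; rho_running is a name made up here, and writing the distance out by component is one possible way to avoid allocating row slices, not the only one:

```julia
# Hypothetical rewrite of ρ: keep a running min and max instead of building vectors.
function rho_running(Q, Qp)
    supval = 0.0
    for i in axes(Q, 1)
        infval = Inf
        for j in axes(Qp, 1)
            # Euclidean distance written out by component, so no row slices are allocated
            d = sqrt((Q[i, 1] - Qp[j, 1])^2 +
                     (Q[i, 2] - Qp[j, 2])^2 +
                     (Q[i, 3] - Qp[j, 3])^2)
            infval = min(infval, d)   # nearest point of Qp to point i of Q
        end
        supval = max(supval, infval)  # largest of those nearest distances
    end
    return supval
end
```

It is called the same way as the original, for example rho_running(A, BB).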

There are a lot of other things as well, but those two struck me particularly.

DNF · September 13, 2023, 9:23pm

BTW, if you want help to really speed this up, please provide example input data, so that posters can run the code themselves. Right now it’s not easy to guess what sort of input data should be used.

Luis_Manuel_Espinoza · September 13, 2023, 9:33pm

I’m sorry, I didn’t think it through

DNF · September 13, 2023, 9:35pm

Can you show how you call H? It takes three different input arguments, how does s relate to those?

Luis_Manuel_Espinoza · September 13, 2023, 9:43pm

Ahhh, the second argument could be ss = -1*s, and n is an Int: the number of random orientations sampled, so increasing it improves the precision of the method. The function H is defined after the other functions in my first post.

But thank you very much, the two observations you made previously have made me see those errors.
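
For readers who want to try the call described above, here is a minimal, self-contained sketch that combines the two observations from this thread. Everything named here (centered, rowdist, directed, diameter, H_sketch) is made up for illustration and is not the original code. It assumes n×3 coordinate matrices, uses only the first four rows of the posted s to stay short, and additionally computes the diameter once outside the loop, since it does not depend on the random rotation:

```julia
using LinearAlgebra  # qr
using Random         # only used to make the example reproducible

# Translate the rows of an n×3 coordinate matrix so that the centroid is at the origin.
centered(coords) = coords .- sum(coords, dims=1) ./ size(coords, 1)

# Euclidean distance between row i of X and row j of Y (three columns assumed).
rowdist(X, i, Y, j) =
    sqrt((X[i, 1] - Y[j, 1])^2 + (X[i, 2] - Y[j, 2])^2 + (X[i, 3] - Y[j, 3])^2)

# Directed Hausdorff distance from X to Y, using a running min and max.
function directed(X, Y)
    s = 0.0
    for i in axes(X, 1)
        m = Inf
        for j in axes(Y, 1)
            m = min(m, rowdist(X, i, Y, j))
        end
        s = max(s, m)
    end
    return s
end

# Largest pairwise distance between the rows of X.
function diameter(X)
    d = 0.0
    n = size(X, 1)
    for i in 1:n, j in i+1:n
        d = max(d, rowdist(X, i, X, j))
    end
    return d
end

function H_sketch(q, qp, n)
    A = centered(q)
    B = centered(qp)
    dA = diameter(A)   # independent of the rotation, so computed once
    best = Inf         # running minimum instead of a vector of all values
    for _ in 1:n
        R = Matrix(qr(randn(3, 3)).Q)  # random orthogonal matrix, as in the original H
        BB = B * R
        best = min(best, max(directed(A, BB), directed(BB, A)) / dA)
    end
    return best
end

# First four rows of the coordinates posted above
s = [1.73761 -2.3299 4.09897
     1.7145 -1.88395 2.6064
     0.810808 -3.53329 4.43137
     1.35042 -1.47855 4.72023]
ss = -1 * s

Random.seed!(1)
h = H_sketch(s, ss, 1000)
println(h)
```

The loop body still allocates a new 3×3 matrix and a new BB on every iteration, so this is a starting point rather than a tuned implementation. Since the iterations are independent, the loop is also a natural candidate for parallelization, which this sketch does not attempt.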
