Problem with NaN's when plugging minimizing vectors (NLopt) back into function

Hello, first off I don’t have a working example due to research privacy and I know that will make things much harder to debug. But I would like to see if anyone recognizes this problem by chance.

I have a function, which I minimize using NLopt and the algorithm PRAXIS. This goes smoothly, I take the minimizer (which I name minx) and copy the vector itself, and paste it into another cell and label it something to save it for later.

When I then go back to plug the vector back into the function later to get the results again, all I get returned from the function is NaN, and when I @show various parts of the function, they are all NaN’s as well. I have not had any other problems with this function and it behaves normally otherwise.

When plugging minx (I have NLopt save the minimizer to minx immediately) back into the function immediately after the minimizing run, I have no problems, I get the results that I should and no NaN’s. Therefore the problem is in copying and saving the vector, and then restarting the kernel, losing the saved minx, and trying to plug back in the same vector (now saved as a different variable) back into the function. I don’t change anything about the function or the minx array between restarts.

Let me know if this rings any bells for anyone. If not, that’s okay. It is not holding me back too much it is just very strange and a bit frustrating.

1 Like

Have you tried to replicate the problem without the optimization? Or have you been able to determine at which step the problem arises. It is very difficult to help you with a so very generalized problem statement.

I will try to narrow it down some. Something else of note is that (say I save a minimizing vector as X0) if I pass X0 to NLopt to use as a starting vector, nothing goes wrong, the optimization run goes normally. And the problem only happens with vectors I have previously used in the optimization. So the problem doesn’t arrive normally, passing other random vectors to the function doesn’t give any NaN’s.

Without a minimal working example, it’s impossible to say. However,

I take the minimizer (which I name minx) and copy the vector itself, and paste it into another cell and label it something to save it for later.

You’re probably not copying the full Float64 value. Consider

julia> rand(3)
3-element Array{Float64,1}:
0.5154380997179793
0.4773473251211171
0.7388211634374753

julia> [rand(3)]
1-element Array{Array{Float64,1},1}:
[0.45449, 0.408123, 0.295315]
2 Likes

Thank you, this is very helpful. I am working on a working example now.

Here is a working example. With D_2 = 4, I get NaN when trying to use XD8G200, but 1 when I pass a random vector, and I get 1 when using XD8G200 with either D_2 = 2 or D_2 = 3. tr(ExpTL) should always be 1. I have found when I remove 0.5* from the construction of F, I don’t get NaN but I get zero not 1 so that is not a change I can make. There might be a problem with the math itself and if that is a case, I will take another look at it but so far I have found no errors.

D_2 = 4
D = 2*D_2 

using LinearAlgebra
 

function ETest(x::Array)
    
    S1 = [0 1; 0 0] 
    
    A = reshape(Complex.(x[1:2:2*D_2^2-1],x[2:2:2*D_2^2]),D_2,D_2)
    B = reshape(Complex.(x[2*D_2^2+1:2:4*(D_2^2)-1],x[2*D_2^2+2:2:4*(D_2^2)]),D_2,D_2)
    
    
    C = kron(S1,Diagonal(ones(D_2)))
    E = kron(Diagonal(ones(2)),A) + kron(S1,B)
    
    
    H = reshape(x[4*D_2^2+1:4*D_2^2+(2D_2)^2],D,D)
    H = Hermitian(Complex.(H,LowerTriangular(H)'-Diagonal(H)))
    
    
    
    F = -im*H - 0.5*(C'*C + E'*E)
    
    I = kron(F,Diagonal(ones(D))) + kron(Diagonal(ones(D)),conj(F)) + kron(C,conj(C)) + kron(E,conj(E))

    J = x[4*D_2^2+(2D_2)^2+1]
    G = x[4*D_2^2+(2D_2)^2+2]
    
    L = 2.0
    ExpT = exp(I) 
    ExpTL = ExpT^L


    epsilon = norm(ExpTL-ExpT,2)/1_000_000
    
    while norm(ExpTL-ExpT,2) > epsilon
        ExpT = ExpTL
        ExpTL = ExpT*ExpT
    end
    
    
    return  tr(ExpTL)
end

XD8G200 = [-1.59374,-5.2499, 1.67174, -1.98958, -0.647899, 6.07679, 1.75827, -4.27265, 2.96154, -1.68131, -0.723875, -2.21622, -1.43416, 0.746147, 5.63919, 1.9931, 2.90512, 0.742848, -3.68103, 0.2776, -2.41628, 3.89632, -8.49975, -4.41796, -5.06176, -3.10374, 8.03138, -1.23213, 1.58675, 2.90112, 1.55164, 2.37669, -2.04646, 0.924121, 0.139059, -2.68708, 1.31853, -0.58986, 2.27582, -0.669315, -0.365731, -2.18405, 1.05335, 0.182093, 0.261764, 1.15751, 1.45095, 1.20171, -0.224569, 2.11413, -1.10776, -2.29912, 0.32957, -1.24015, 0.927298, -0.693715, 0.168881, 1.13747, -0.161536, -0.747931, -0.0125401, -0.577182, -0.31807, 0.0709998, 1.21963, -3.78828, 3.18148, -5.90588, -6.8576, 2.22467, 24.6244, -23.6736, 0.922213, -2.682, -1.59649, 4.73233, 20.2889, 1.37649, -27.4573, 44.7773, 6.01401, -1.93724, 3.19959, 1.72178, 2.2031, -0.980421, -13.4228, 12.3053, -0.405479, 3.12535, -0.256148, -0.0570012, 2.87966, 0.364477, -12.8079, 12.7975, 24.8966, -10.8566, -15.2349, -10.306, -5.34033, -3.3093, 4.65935, -4.26519, 4.51724, -2.40256, -0.495797, 0.31801, -2.48692, -2.72646, -5.89435, 5.61049, -16.2209, -8.38797, 13.4124, 8.94201, 6.46931, 0.158825, 3.15498, -10.7349, 41.1833, -9.70869, -30.2226, -22.3293, -6.48997, -1.12033, -0.123085, -0.573403, 1.67974, -1.66205];
ETest(XD8G200) 
#or
ETest(randn(4*D_2^2+(2D_2)^2+2))

Don’t use globals, there is no reason the D and D_2 cannot be just arguments to ETest.

I don’t know what you are trying to do here, but

    while norm(ExpTL-ExpT,2) > epsilon
        ExpT = ExpTL
        ExpTL = ExpT*ExpT
    end

is effectively just squaring ExpTL after the first step, a numerically unstable algorithm. ExpTL just blows up. This is the part that is giving you NaNs with the given XD8G200 value.

1 Like

Okay, I have been worried about that loop for a long time. Here’s what I wanted it to do:
I have a matrix ExpT = e^(IL). In my calculations, I need to use the value of ExpT as L goes to infinity. Basically this matrix will converge at some point as L increases, and I want to use that converged value. This loop was the way that I tried to “take L to infinity” with lack of a better construction. Do you have any suggestions for how to better construct this loop in order to do what I would like it do to?

I would recommend looking at the famous “19 dubious ways” article, especially decomposition methods (Section 6), but I did not study your problem in detail.