How to employ nlsolve() to a function which has parameters which are difficult to calculate

Basically, I am trying to solve the Coupled Cluster Equations using Julia and the package NLsolve which can be used to solve such nonlinear equations. My initial strategy was to define a function as follows that accepts only one matrix as a parameter


function f_old(T2)
    R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables();
    R2 = calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2);
    R2
end

T2guess = initialize_t2_only()
sol=nlsolve(f_old, T2guess)
finalamps = sol.zero
R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables()
ECCSD = calculate_ECCSD(g_oovv,finalamps)
ECCSD[1]+erhf

which gives the output

 * Inf-norm of residuals: 0.000000
 * Iterations: 15
 * Convergence: true
   * |x - x'| < 0.0e+00: false
   * |f(x)| < 1.0e-08: true
 * Function Calls (f): 16
 * Jacobian Calls (df/dx): 16

The variables g_voooo etc are very difficult to compute, and hence it’s a bad idea to initialize them every time the function f is called (they remain constant throughout every iteration). So I tried to follow this post by @ufechner7 , and as per the post, I modified my function and succeeding statements as follows

function f!(F,T2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2)
    R2 = calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2);
    F = R2
    F
end
R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables();
T2guess = initialize_t2_only()
sol = nlsolve((F,T2)->f!(F,T2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2), T2guess)
finalamps = sol.zero
ECCSD = calculate_ECCSD(g_oovv,finalamps)
ECCSD[1]+erhf

But the later gives me the output

 * Inf-norm of residuals: 0.067053
 * Iterations: 1000
 * Convergence: false
   * |x - x'| < 0.0e+00: false
   * |f(x)| < 1.0e-08: false
 * Function Calls (f): 1
 * Jacobian Calls (df/dx): 1

which seems to be incorrect because it seems like it converges in just one iteration? Also the the quantity ECCSD[1]+erhf in the later case has a less negative value (a better solution should produce a more negative value of this quantity since this is basically the electronic energy). Any help is greatly appreciated !

Unfortunately, the structure of NLSolve requires a function that doesn’t accept parameters as arguments
What I usually do is define a parent function that defines the parameters
Something similar to this

compute_parameters() = (a=1, b=2, c=3)
tup                  = compute_parameters()

function compute_result(x,y, tup)
    a,b,c = tup         # destructure parameters

    function foo(x,y)   
        x + y + a + b + c   # body of computations
    end

    function nl_solve()
        # define initial conditions and call nlsolve based on foo(x,y)
    end

    nlsolve()       # define nlsolve to return the result
end

With this approach, be careful of type instabilities, as you’re working with closures.
Also, note you can define the residuals as an argument, which improves the performance of NLSolve. That’s described on the package’s website.

Maybe this can be of some help in terms of the structure I followed above. It also uses the residual as a function argument.

1 Like

Thank you for your idea, I tried to follow the structure you provided for the context of my code. As far as I understand it, instead of x,y I would have the tensor T2 , which is essentially what I am solving for. However I cannot understand exactly to put inside the nl_solve() function that you suggested and how to proceed forward with actually calling the solver. Below is what I have written so far :

function compute_parameters()
    R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables();
    tup = g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2
    return tup
end
function compute_result(T2,tup)
    g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2 = tup
    function foo1(T2)
        calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2)
    end
    function nl_solve()
        # Cannot understand what to write here
    end
    # OR here
end

Where the calculate_residual(...) function takes the aforementioned parameters and outputs another tensor. The entire goal is to find such a T2 such that the calculate_residual function gives a zero tensor as output. It must be noted that all other parameters other than T2 that calculate_residual(...) accepts are either constants or simply workspaces. Below is the full definition of the function

function calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2)
    @tensoropt begin
        R2u[a_1,a_2,i_1,i_2]= 0.5* g_vvoo[a_1,a_2,i_1,i_2] - g_voov[a_1,i_3,i_1,a_3] * T2[a_2,a_3,i_3,i_2] - g_vovo[a_2,i_3,a_3,i_2] * T2[a_1,a_3,i_1,i_3]- g_vovo[a_2,i_3,a_3,i_1] * T2[a_1,a_3,i_3,i_2]+ 2*g_voov[a_1,i_3,i_1,a_3] * T2[a_2,a_3,i_2,i_3]+ 0.5*g_oooo[i_3,i_4,i_1,i_2] * T2[a_1,a_2,i_3,i_4]+ fvv[a_2,a_3] * T2[a_1,a_3,i_1,i_2]+ 0.5*g_vvvv[a_1,a_2,a_3,a_4] * T2[a_3,a_4,i_1,i_2]- foo[i_3,i_2] * T2[a_1,a_2,i_1,i_3]+ 2*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_3,i_1,i_3]) * T2[a_2,a_4,i_2,i_4]- 2 *(g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_3,i_1,i_3]) * T2[a_2,a_4,i_4,i_2]+ 0.5*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_3,i_3,i_1]) * T2[a_2,a_4,i_4,i_2]- 2*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_3,a_4,i_3,i_2]) * T2[a_1,a_2,i_1,i_4]- 1*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_4,i_1,i_3]) * T2[a_2,a_3,i_2,i_4]+ (g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_4,i_1,i_3]) * T2[a_2,a_3,i_4,i_2]+ (g_oovv[i_3,i_4,a_3,a_4] * T2[a_2,a_4,i_4,i_3]) * T2[a_1,a_3,i_1,i_2]+ (g_oovv[i_3,i_4,a_3,a_4] * T2[a_3,a_4,i_2,i_3]) * T2[a_1,a_2,i_1,i_4]- 2*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_2,a_3,i_4,i_3]) * T2[a_1,a_4,i_1,i_2]+ 0.5*(g_oovv[i_3,i_4,a_3,a_4] * T2[a_1,a_4,i_3,i_2]) * T2[a_2,a_3,i_4,i_1]+ 0.5* (g_oovv[i_3,i_4,a_3,a_4] * T2[a_3,a_4,i_1,i_2]) * T2[a_1,a_2,i_3,i_4]
     end
     @tensor begin
        R2[a,b,i,j] := R2u[a,b,i,j] + R2u[b,a,j,i]
     end
     return R2
end

Please let me know what can be done so that this package can be used to solve this set of non-linear equations.

NonlinearSolve.jl accepts parameters and SciML deprecated all use of NLsolve.jl to NonlinearSolve.jl for this reason.

using NonlinearSolve
function f!(F,p)
    R2 = calculate_residual(p...);
    F .= R2
    nothing
end
p = initialize_cc_variables();
T2guess = initialize_t2_only()
prob = NonlinearProblem(f!, T2guess, p)
sol = solve(prob)
finalamps = sol.u

You can use this in a way that reuses the same cache across solves, just changing parameters, for performance:

using NonlinearSolve
function f!(F,p)
    R2 = calculate_residual(p...);
    F .= R2
    nothing
end
p = initialize_cc_variables();
T2guess = initialize_t2_only()
prob = NonlinearProblem(f!, T2guess, p)
cache = init(prob)
solve!(cache)
finalamps1 = copy(cache.u)

cache.p = new_parameters
solve!(cache)
finalamps2 = copy(cache.u)