How to decrease the number of allocations when using NLsolve.jl?

With reference to my last post, I am trying to solve a set of non-linear equations known as the Coupled Cluster Equations in Quantum Chemistry using the NLsolve.jl package. As per the reply of @alfaromartino in the same post, I have restructured my code in a way that is compatible with the nlsolve() function from NLsolve.jl. Below is the code.

First Code
using NLsolve
include("cc-helper.jl")
function compute_parameters()
    R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables();
    tup = g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2,erhf
    return tup
end
function compute_result(tup)
    g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2,erhf = tup
    function foo1(T2) 
        # This is a placeholder function. I had to write it this way because this is the only type of function that nlsolve accepts
        calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2)
    end
    function nl_solve()
        # Cannot understand what to write here, for now let's leave it empty
    end
    # OR here
    T2 = initialize_t2_only();
    sol=nlsolve(foo1, T2)
    finalamps = sol.zero;
    ECCD = calculate_ECCD(g_oovv,finalamps);
    etot = ECCD[1]+erhf;
    return finalamps,etot;
end
tup = compute_parameters();
@time finalamps,etot=compute_result(tup);

Now this produces the expected result without any error. However, it is not memory efficient, since the residual can be written into a pre-allocated argument, as shown in the documentation of the package and as advised by @alfaromartino. So, I restructured my code to do that, as given below.

New code
using NLsolve
include("cc-helper.jl")
function compute_parameters()
    R2u,R2,g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,K,fvv,foo,erhf = initialize_cc_variables();
    tup = g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2,erhf
    return tup
end
function compute_result(tup)
    g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,fvv,foo,R2u,R2,erhf = tup
    function foo1(F::AbstractArray{Float64,4},T2::AbstractArray{Float64,4}) 
        # This is a placeholder function. I had to write it this way because this is the only type of function that nlsolve accepts
        @views F[:,:,:,:]=calculate_residual(g_vvoo,g_voov,g_vovo,g_oovv,g_oooo,g_vvvv,T2,fvv,foo,R2u,R2)
    end
    function nl_solve()
        # Cannot understand what to write here, for now let's leave it empty
    end
    # OR here
    T2 = initialize_t2_only();
    sol=nlsolve(foo1, T2)
    finalamps = sol.zero;
    ECCD = calculate_ECCD(g_oovv,finalamps);
    etot = ECCD[1]+erhf;
    return finalamps,etot;
end
tup = compute_parameters();
@time finalamps,etot=compute_result(tup);

However, I observe no significant change in allocations between these two versions. Both report (excluding compilation time)

 0.611511 seconds (3.78 M allocations: 286.140 MiB, 8.17% gc time)

I even used the @views macro to ensure that the slicing syntax does not allocate a new tensor and instead reuses the memory of F. Any ideas what is going on here and what I might do to fix this?
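To make this easier to see in isolation, here is a toy stand-in (toy_residual and copy_into! are made up for illustration and are not my actual code) that reproduces the copy-into-F pattern I am using:

toy_residual(T2) = 2.0 .* T2 .- 1.0   # allocating stand-in for calculate_residual

function copy_into!(F, T2)
    # F is reused, but toy_residual still allocates a fresh tensor on every
    # call before its contents are copied into F.
    @views F[:, :, :, :] = toy_residual(T2)
    return nothing
end

T2 = rand(4, 4, 4, 4);
F = similar(T2);
copy_into!(F, T2);             # warm-up call to exclude compilation
@allocated copy_into!(F, T2)   # still reports roughly sizeof(T2) bytes allocated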

A few things. First, switch to NonlinearSolve.jl and use the interface that allows re-using the cache between solves, as described in the other post:

using NonlinearSolve
function f!(F, T2, p)
    # unpack the fixed parameters and write the residual for the current T2 into F
    g_vvoo, g_voov, g_vovo, g_oovv, g_oooo, g_vvvv, fvv, foo, R2u, R2, erhf = p
    F .= calculate_residual(g_vvoo, g_voov, g_vovo, g_oovv, g_oooo, g_vvvv, T2, fvv, foo, R2u, R2)
    nothing
end
p = compute_parameters()
T2guess = initialize_t2_only()
prob = NonlinearProblem(f!, T2guess, p)
cache = init(prob)
solve!(cache)
finalamps1 = copy(cache.u)

cache.p = new_parameters
solve!(cache)
finalamps2 = copy(cache.u)

Then make calculate_residual actually in-place, i.e. make it mutate F instead of creating a new array. You’ve already lost if it returns a new array.
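Something along these lines (a sketch only; I don't know what is inside your calculate_residual, so calculate_residual!, its body, and the parameter list here are placeholders for your actual contractions):

# In-place residual: writes the result into R2out instead of returning a new tensor.
function calculate_residual!(R2out, g_vvoo, g_oovv, T2, fvv, foo)
    fill!(R2out, 0.0)
    R2out .+= g_vvoo      # placeholder for the driver term
    # accumulate the remaining CC contractions directly into R2out here
    return nothing
end

# f! then hands F straight to the residual, so no temporary tensor is created:
function f!(F, T2, p)
    g_vvoo, g_oovv, fvv, foo = p      # placeholder unpacking of the parameters
    calculate_residual!(F, g_vvoo, g_oovv, T2, fvv, foo)
    return nothing
end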

Isn’t there any way to achieve the same without switching to NonlinearSolve.jl? I would need to use Pulay mixing to accelerate the convergence of these equations, and solve() from NonlinearSolve.jl does not seem to have native support for choosing among different methods the way NLsolve.jl does. If I wanted to make it happen I would still have to use NLsolve.jl within the SciML interface.

NonlinearSolve.jl has tons more methods than NLsolve.jl. In fact, it includes NLsolve.jl itself in that method set.
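For example, once NLsolve is loaded next to NonlinearSolve, its algorithms (including Anderson) are available through the NLsolveJL wrapper; the keyword names below are as I recall them, so check the NonlinearSolve.jl solver docs:

using NonlinearSolve, NLsolve   # loading NLsolve activates the wrapper extension

prob = NonlinearProblem(f!, T2guess, p)
sol = solve(prob, NLsolveJL(method = :anderson, m = 10, beta = 1.0))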

I would need Pulay mixing, which I could not find in the full list of methods given.

Can you open an issue?

Hi,
Unfortunately I am very new to this language and the community. I don’t know how to “open an issue” in this regard. Some pointers would be really helpful.

https://arxiv.org/pdf/2002.12850 might be interesting. It seems that DIIS (Pulay) methods are closely related to Anderson acceleration and Krylov methods, so it might be worth seeing if this is just solvable using the existing solvers configured to use these strategies.
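For example (assuming the prob from the NonlinearProblem setup above), a Jacobian-free Newton-Krylov solve looks roughly like this; it only ever uses the Jacobian through Jacobian-vector products, so the full Jacobian is never formed explicitly:

using NonlinearSolve, LinearSolve

# Newton's method with a GMRES linear solver from LinearSolve.jl
sol = solve(prob, NewtonRaphson(linsolve = KrylovJL_GMRES()))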

NLsolve.jl does support Anderson acceleration, so that was my end goal: to solve the Coupled Cluster Equations using the nlsolve() function with the method = :anderson argument (a sketch of the call I have in mind is at the end of this post). However, right now I am faced with two problems:

  1. I cannot seem to reduce the allocations when using NLsolve.jl (this is still a minor problem), and the solution seems to involve switching over to the newer NonlinearSolve.jl.

  2. Then comes the major problem (please advise me on whether or not I should open a different topic about it) - my input variable T2 is a rank-4 tensor, and the function I am trying to find the roots of takes T2 and returns R2, which is another rank-4 tensor. So the Jacobian is going to be a rank-8 tensor. Now, I want to supply the Jacobian to the solver manually, but it seems to only support Jacobian “matrices”, i.e. rank-2 tensors.
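For completeness, the Anderson call I have in mind for my end goal (a sketch; the m and beta keywords are taken from my reading of the NLsolve.jl docs):

using NLsolve

T2 = initialize_t2_only()
# foo1 is the in-place residual foo1(F, T2) from the "New code" above;
# m is the Anderson memory depth and beta the mixing (damping) parameter.
sol = nlsolve(foo1, T2; method = :anderson, m = 10, beta = 1.0)
finalamps = sol.zero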