How to solve a set of nonlinear fixed-point equations using Anderson acceleration in NLsolve.jl?

I have a set of two nonlinear fixed-point equations like

x = g1(x, y)
y = g2(x, y)

where x and y are n-dimensional arrays (n = 1, 2, 3), and length(x) and length(y) can be very large, e.g. 128x128x128 for a 3D problem. How can I solve this using the Anderson algorithm in the NLsolve.jl package, which seems to be designed for x = g(x) only?
Thanks!

Note that your problem has precisely the form z = g(z) if you were to consider z := (x, y) and g := (g1, g2).

The only thing you have to do is to concatenate both variables and functions. Assuming x and y have the same size and dimensions, one way to do this would be
z = cat(x, y; dims=ndims(x)+1) # concatenate along a new dimension
And then a similar procedure for your functions.
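For instance, a minimal sketch of such a wrapped function (assuming x and y have the same size) could be

function g(z)
    x = selectdim(z, ndims(z), 1)  # view of the x part, no copy
    y = selectdim(z, ndims(z), 2)  # view of the y part, no copy
    cat(g1(x, y), g2(x, y); dims=ndims(z))
end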

But personally I am not a fan of this, because explicit concatenation with these cat functions is not very efficient, especially for big arrays like yours. It also only works properly if x and y have the same size and type. On top of that, you’d need to throw @views everywhere to avoid further allocations.

So something much nicer would be to use an array of arrays such as
z = [x, y]
g(z) = [g1(z[1], z[2]), g2(z[1], z[2])]

But nlsolve can’t really operate on such containers.

However, it can work on slightly more refined containers, such as the ones in RecursiveArrayTools, which are designed specifically for these tasks.

using RecursiveArrayTools
z = VectorOfArray([x, y])
g(z) = VectorOfArray([g1(z[1], z[2]), g2(z[1], z[2])])
nlsolve(g, z; method=:anderson)

Great! I will try them out and mark this post as the solution if everything goes well. Thanks!

By studying the source code of NLsolve, I am confused about which function I should pass to nlsolve.
It seems that nlsolve expects to solve f(x) = 0. Thus, should I do

f(z) = g(z) - z
nlsolve(f, z; method=:anderson)

instead?

Another issue I have encountered is the way NLsolve defines its AndersonCache object. Line 24 of NLsolve.jl/src/solvers/anderson.jl is

mmax = min(length(x), m)

which is too restrictive for VectorOfArray objects. Here x is a VectorOfArray, so length(x) is the number of AbstractArrays it contains (2 in this example). However, we expect mmax to be allowed to be much larger, because each AbstractArray in the object can itself be very large, e.g. 128x128x128 for 3D problems.
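To illustrate the problem (with the sizes from my original question):

using RecursiveArrayTools
x = rand(128, 128, 128); y = rand(128, 128, 128)
z = VectorOfArray([x, y])
length(z)        # 2 -- the number of component arrays
sum(length, z.u) # 4194304 -- the total number of unknowns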

At present, I have written my own constructor for AndersonCache to allow any m.
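Roughly along these lines (a sketch only: the field names and their order are inferred from the usage in my implementation further down, and the real internals may differ between NLsolve versions):

import NLsolve
using RecursiveArrayTools: AbstractVectorOfArray

function my_anderson_cache(x::AbstractVectorOfArray, m::Integer)
    T = eltype(x)
    n = sum(length, x.u)  # total number of unknowns, not the number of component arrays
    mmax = min(n, m)      # the cap that length(x) was presumably meant to enforce
    Δgs = [deepcopy(x) for _ in 1:mmax]  # history of g-differences
    γs = Vector{T}(undef, mmax)          # least-squares coefficients
    Q = Matrix{T}(undef, n, mmax)        # QR factors of the f-differences
    R = Matrix{T}(undef, mmax, mmax)
    NLsolve.AndersonCache(deepcopy(x), deepcopy(x),  # x, g
                          deepcopy(x), deepcopy(x),  # fxold, gold
                          Δgs, γs, Q, R)
end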

A fixed-point equation z = g(z) can always be mapped to a root-finding problem f(z) = g(z) - z = 0 by precisely that recipe!
Note that NLsolve.jl has a wrapper for this: instead of calling nlsolve you call fixedpoint, and this is taken care of for you: https://github.com/JuliaNLSolvers/NLsolve.jl/blob/master/src/nlsolve/fixedpoint.jl#L30
But personally I would keep using the standard nlsolve function call: being more explicit helps in the long run.
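For example, with g and z as defined earlier in this thread, both call styles would look like this (a sketch; I am assuming the m keyword, which sets the Anderson memory depth):

f(z) = g(z) - z                          # residual form for nlsolve
nlsolve(f, z; method=:anderson, m=5)

fixedpoint(g, z; method=:anderson, m=5)  # the wrapper forms the residual for you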

Also note that m is NOT related to the size of your solution z (without that min operation you could end up with an underdetermined system!). You can find an explanation of what m does here: Anderson acceleration - Wikipedia

So if my fixed-point equation is x = g(x), I should pass f(x) = g(x) - x to nlsolve, right?

I know exactly what m does in Anderson mixing. The issue here is that Base.length as defined for VectorOfArray only returns the number of arrays it contains. Thus, if I use the AndersonCache constructor provided by NLsolve, m is restricted to at most 2 (it always chooses the minimum of the given m and length(VectorOfArray)). I wrote my own version of Anderson mixing in C++ long ago, and there I can use m as large as 100 to accelerate convergence.

I marked this post as the solution, as it provides a very good approach for representing a set of fixed-point equations involving large high-dimensional data. However, this approach cannot be applied directly to NLsolve's Anderson solver, which handles VectorOfArray objects in a way that caps m at length(VectorOfArray) (the number of component arrays) rather than the total length of the multi-dimensional data.

I ended up rewriting my own version of Anderson mixing, closely following Walker's pseudocode (i.e. its treatment of the β relaxation). Since Q*R*γ results in a flattened vector, we have to figure out a way to copy this long vector back into a VectorOfArray{MyType} object. At present, I overload Base.copyto! to accomplish the task, but it is not universal.
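For reference, the overload is essentially this (a sketch, assuming each component array absorbs a contiguous chunk of the flat vector in memory order):

import Base: copyto!

function copyto!(dest::VectorOfArray, src::AbstractVector)
    offset = 0
    for a in dest.u  # walk the component arrays in storage order
        n = length(a)
        copyto!(vec(a), view(src, offset+1:offset+n))
        offset += n
    end
    return dest
end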

Thanks for your input, @FMeirinhos!

And here is my implementation (initial_x has to be an AbstractVectorOfArray):
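For completeness, these are the imports it needs, plus the small vecnorm1 helper it calls (assuming vecnorm1 is the entrywise 1-norm, as the name suggests):

using LinearAlgebra: UpperTriangular, cond, ldiv!, mul!, norm
using NLSolversBase: NonDifferentiable, OnceDifferentiable, value, value!!
using RecursiveArrayTools: AbstractVectorOfArray, VectorOfArray
import NLsolve

vecnorm1(a) = norm(vec(a), 1)  # assumed definition: entrywise 1-norm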

function anderson_(df::Union{NonDifferentiable, OnceDifferentiable},
                            initial_x::AbstractVectorOfArray,
                            xtol::Real,
                            ftol::Real,
                            iterations::Integer,
                            store_trace::Bool,
                            show_trace::Bool,
                            extended_trace::Bool,
                            β::Real,
                            aa_start::Integer,
                            droptol::Real,
                            cache::NLsolve.AndersonCache)
    @. cache.x = initial_x
    tr = NLsolve.SolverTrace()
    tracing = store_trace || show_trace || extended_trace

    aa_iteration = cache.γs !== nothing
    x_converged, f_converged, converged = false, false, false
    m = aa_iteration ? length(cache.γs) : 0
    m_eff = 0  # effective history size (initialized unconditionally so the m_eff check below is always defined)

    # Preallocate Δf = f_{k+1}(x) - f_{k}(x)
    Δf = deepcopy(cache.fxold)

    iter = 0
    while iter < iterations
        # evaluate function: f(x) = g(x) - x
        value!!(df, cache.x)
        # Give a name to value(df). It will not allocate.
        fx = value(df)

        # check that all values are finite
        NLsolve.check_isfinite(fx)

        # Compute g(x) = x + f(x)
        @. cache.g = cache.x + fx

        # mean relative residual over the component arrays
        res = zero(eltype(initial_x))
        for i in 1:length(initial_x)
            res += vecnorm1(fx[i]) / vecnorm1(cache.x[i])
        end
        res /= length(initial_x)

        # report progress every 100 iterations
        if iter % 100 == 0
            @show iter, res
        end

        # check convergence: relative residual below ftol
        f_converged = res < ftol
        converged = x_converged || f_converged
        if any(isnan, cache.x) || any(isnan, fx)
            break
        end
        if converged
            @show iter, res
            break
        end

        # perform Anderson acceleration
        if iter > aa_start && m > 0
            # increase size of history
            m_eff += 1
            @. Δf = fx - cache.fxold
            if m_eff < (m+1)
                @. cache.Δgs[m_eff] = cache.g - cache.gold
            else
                # circularly shift differences of g
                Δg1 = cache.Δgs[1]
                for i in 1:(m-1)
                    cache.Δgs[i] = cache.Δgs[i+1]
                end
                cache.Δgs[m] = Δg1
                @. cache.Δgs[m] = cache.g - cache.gold
            end
        end

        @. cache.fxold = fx
        @. cache.gold = cache.g

        if m_eff == 0 || m == 0
            # no history yet (or acceleration disabled): plain, possibly β-damped, fixed-point step
            if 0 < β ≤ 1.0
                @. cache.x += β * fx
            else
                @. cache.x = cache.g
            end
        else
            if m_eff == 1
                NLsolve.qradd!(cache.Q, cache.R, vec(Δf), m_eff)
            else
                if m_eff > m
                    # delete left-most column of QR decomposition
                    NLsolve.qrdelete!(cache.Q, cache.R, m)
                    m_eff -= 1
                end
                NLsolve.qradd!(cache.Q, cache.R, vec(Δf), m_eff)
            end

            # define current Q and R matrices
            Q, R = view(cache.Q, :, 1:m_eff), UpperTriangular(view(cache.R, 1:m_eff, 1:m_eff))

            # check condition
            if droptol > 0
                while cond(R) > droptol && m_eff > 1
                    NLsolve.qrdelete!(cache.Q, cache.R, m_eff)
                    m_eff -= 1
                    Q, R = view(cache.Q, :, 1:m_eff), UpperTriangular(view(cache.R, 1:m_eff, 1:m_eff))
                end
            end

            # solve least squares problem
            γs = view(cache.γs, 1:m_eff)
            ldiv!(R, mul!(γs, Q', vec(fx)))

            # update next iterate
            for i in 1:m_eff
                @. cache.g -= cache.γs[i] * cache.Δgs[i]
            end
            @. cache.x = cache.g
            if 0 < β < 1
                copyto!(Δf, Q * R * γs)
                @. cache.x -= (1-β) * (fx - Δf)
            end
        end
        iter += 1
    end

    return NLsolve.SolverResults("Anderson m=$m β=$β aa_start=$aa_start droptol=$droptol",
                        initial_x, deepcopy(cache.x), norm(value(df), Inf),
                        iter, x_converged, xtol, f_converged, ftol, tr,
                        first(df.f_calls), 0)
end