Robust solver for ℝ³ → ℝ³ root with flat parts

I have an ℝ³ → ℝ³ function f which is differentiable (the computation uses an implicit solution, but I implemented the derivatives for Enzyme.jl, Mooncake.jl, and ForwardDiff.jl). I would like to find the root f(x) = 0. (Sorry, 4k LOC, no MWE).

I am using simulated data, so I am certain that the root exists (and actually I can recover f(x_0) \approx 0 using the known root x_0).

The problem is that while f is differentiable and smooth, parts of it are “flat”, as shown in this plot using GLMakie:

The flat region does not admit a simple characterization, and the solvers I tried fail when they venture there. E.g., I tried all the solvers in NonlinearSolve.jl, and they occasionally run into this region and report the return code Stalled.

I wonder how I could avoid this.

Maybe use a global optimization algorithm with a coarse tolerance to get sufficiently close to the root, and then use a Newton method?

In general it seems like you’ll need some global-search phase to escape from a region where the function is constant (or is it just nearly constant?).
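That two-phase idea can be sketched on a toy problem. Here `f` is a made-up stand-in (not the actual function) that is flat for negative coordinates and has its root at (0.5, 0.5, 0.5); phase 1 is a coarse scan of the box, phase 2 a hand-rolled Newton with a forward-difference Jacobian, so no packages are needed:

```julia
# Toy stand-in for f (hypothetical): flat for negative coordinates,
# root at x = (0.5, 0.5, 0.5).
f(x) = max.(x, 0.0) .^ 3 .- 0.125

# Phase 1: coarse scan of the box, keep the point with the smallest residual norm.
function coarse_scan(f, grid)
    best_x, best_r = zeros(3), Inf
    for x1 in grid, x2 in grid, x3 in grid
        x = [x1, x2, x3]
        r = sum(abs2, f(x))
        r < best_r && ((best_x, best_r) = (x, r))
    end
    best_x
end

# Phase 2: Newton with a forward-difference Jacobian.
function newton(f, x; iters = 50, h = 1e-7)
    for _ in 1:iters
        y = f(x)
        maximum(abs, y) < 1e-12 && break
        # column j of the Jacobian by forward differences
        J = hcat(((f(x .+ h .* (1:3 .== j)) .- y) ./ h for j in 1:3)...)
        x = x .- J \ y
    end
    x
end

x0 = coarse_scan(f, -1.0:0.2:1.0)  # coarse: just needs to escape the flat part
root = newton(f, x0)               # polishes to the root
```

The coarse phase only has to land in the region where the residual varies; Newton does the rest.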

The problem is that while f is differentiable and smooth, parts of it are “flat”,

I’m not sure what you mean by “smooth”, here. (Infinitely differentiable? Analytic?) From your plot, it looks like the derivative has a discontinuity.


Thanks. Can you recommend a Julia package? Or should I do a simple grid search?

It's once differentiable, with continuous derivatives (it's just that they change rapidly in a small region around the edges, which is why the plot looks the way it does).

You can probably do better than a grid search. For only 3 variables with a Lipschitz continuous function, optimizing to low accuracy, I’ve had good success with the DIRECT derivative-free algorithm. For exploiting derivatives in global optimization, I like MLSL. Both of these are in NLopt.jl, for example. But there are so many algorithms in this space, and it’s best to experiment for a given problem.
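For concreteness, a hedged sketch of what the DIRECT phase might look like with NLopt.jl, minimizing the sum of squared residuals. The residual function, bounds, and stopping settings here are all made up for illustration; only the NLopt.jl calls are real:

```julia
using NLopt

# Hypothetical stand-in for f: flat for negative coordinates, root at (0.5, 0.5, 0.5).
residual(x) = max.(x, 0.0) .^ 3 .- 0.125

opt = Opt(:GN_DIRECT_L, 3)          # DIRECT-L: derivative-free global search
opt.lower_bounds = fill(-1.0, 3)
opt.upper_bounds = fill(1.0, 3)
opt.stopval = 1e-8                  # low accuracy is enough to seed a local solver
opt.maxeval = 20_000
opt.min_objective = (x, grad) -> sum(abs2, residual(x))  # grad unused (derivative-free)
minf, minx, ret = optimize(opt, zeros(3))
# `minx` can then seed a Newton-type root solver.
```

Squaring the residuals turns the root-finding problem into a global minimization with minimum value 0, which is what DIRECT expects.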


So, I ended up trying all the solvers in NLopt.jl, then various robust solvers in Optimization.jl, and none of them worked reliably (i.e., on more than 90% of the random tests). Then I remembered that, from the setup of this problem, it is “almost separable”: coordinate j mostly affects residual j. I ended up coding a simple coordinatewise solver:

"""
$(SIGNATURES)

Solve ``f(x) ≈ 0``, using `f!(y, x)` where ``y = f(x)``. Stops at `iter_max` iterations,
when the coordinates move less than `xabs`, or the residuals are below `ytol`.

Uses a coordinate-wise univariate method. For smooth problems, this is likely to be
inferior to almost all solvers under the sun. But for nondifferentiable residuals which
are monotonic in the corresponding input, it can work. As a last resort.
"""
function coordinatewise_solver(f!, x0, lb, ub;
                               iter_max = 100, xabs = √eps(eltype(x0)), ytol = xabs,
                               method = Roots.Brent())
    x = collect(x0)             # make it a vector
    N = length(x)
    z = similar(x)              # a buffer
    y = similar(x)              # residuals
    Δmin = oftype(xabs, Inf)
    result(status, i) = (; x, y, Δmin, status, iterations = i)
    function f_j(j)
        z .= x
        function(xj)
            z[j] = xj
            f!(y, z)
            y[j]
        end
    end
    for i in 1:iter_max
        Δmin = oftype(xabs, Inf)
        for j in 1:N
            fj = f_j(j)
            if abs(fj(x[j])) > ytol
                # safety depends on Roots not doing threading
                xj = Roots.find_zero(fj, (lb[j], ub[j]))
                Δmin = min(abs(xj - x[j]), Δmin)
                x[j] = xj
            end
        end
        f!(y, x)
        maximum(abs, y) ≤ ytol && return result(:converged, i)
        Δmin ≤ xabs && return result(:small_moves, i)
    end
    result(:iter_max, iter_max)
end

This works 100% of the time. It is also the method I use as a silly strawman when I teach numerical methods in economics, so this is kind of karmic.


Just to mention another possibility: you could rescale the function componentwise, multiplying f1 by e^{x1^2}, f2 by e^{x2^2}, etc. This preserves the roots and smoothly distorts all the flat areas (and should not introduce new ones, unless your function is a Gaussian density…).
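To illustrate on a made-up flat residual (not the actual f): the rescaled system g_j(x) = f_j(x) · e^{x_j^2} has the same roots, but in the flat region its j-th component still varies with x_j:

```julia
# Made-up residual: flat for negative coordinates, root at (0.5, 0.5, 0.5).
f(x) = max.(x, 0.0) .^ 3 .- 0.125

# Componentwise rescaling: same roots; a constant stretch c of f_j becomes
# c * e^{x_j^2}, which varies in x_j (except right at x_j = 0).
g(x) = f(x) .* exp.(x .^ 2)

# Roots coincide: g at the root of f is still zero.
g_at_root = g([0.5, 0.5, 0.5])

# In the flat region f_1 has zero slope in x_1, but g_1 does not:
# ∂g_1/∂x_1 = f_1 * 2x_1 * e^{x_1^2} = (-0.125)(-1)e^{0.25} ≈ 0.16 at x_1 = -0.5.
h = 1e-6
x = [-0.5, -0.5, -0.5]
slope = (g(x .+ [h, 0, 0])[1] - g(x)[1]) / h
```

So a Newton-type solver applied to g sees a nonzero derivative even where f is constant, which is exactly what pushes it out of the flat region.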
