Finding saddle points

I’m trying to find the saddle point of a function. I’ve programmed a function saddle below (but if you know of a library that finds saddle points, please let me know). The idea is that the input function is specified so that its first argument is a vector over which the function is minimized, and its second argument is a vector over which it is maximized.

using Optim

function saddle(f::Function, initx, inity; method = NelderMead())
    # f is assumed to be f(x, y): minimized over x, maximized over y
    ymax = similar(inity)
    function fx(x)
        # inner problem: maximize f(x, .) by minimizing its negative
        optymax = optimize(y -> -f(x, y), inity, method)
        ymax = Optim.minimizer(optymax)
        return -Optim.minimum(optymax)
    end
    optxmin = optimize(fx, initx, method)
    xmin = Optim.minimizer(optxmin)
    # re-solve the inner problem at xmin: the last ymax set inside fx need
    # not correspond to the outer minimizer
    ymax = Optim.minimizer(optimize(y -> -f(xmin, y), ymax, method))
    return (f(xmin, ymax), xmin, ymax)
end
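
For reference, here is how I call it on a made-up test function with a known saddle point at the origin:

# toy test: g(x, y) = |x|^2 - |y|^2 has its saddle point at x = y = 0
g(x, y) = sum(abs2, x) - sum(abs2, y)
val, xmin, ymax = saddle(g, [1.0, -1.0], [0.5, 0.5])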

Just wanted to check if this is a good way to program a function that finds saddle points. What do you think?

First, really consider using gradients unless the dimension of your problem is very small (note that this can be implemented easily in your approach: the gradient of the inner function fx is simply df/dx; there are no extra terms). Second (assuming gradients), this two-level optimisation is likely to be suboptimal, because you’re optimizing y to high precision even when x is far from the minimum. I think I would use NLsolve.jl with Anderson acceleration on a flipped gradient iteration, X_{n+1} = X_n - alpha * [df/dx; -df/dy]. CC @cortner who might have better ideas.
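
To make that concrete, here is a minimal sketch of the flipped-gradient idea with NLsolve’s Anderson acceleration; the test function, dimensions, step size beta and memory m are all made-up choices:

using NLsolve, ForwardDiff

# toy saddle problem: minimize over x, maximize over y
f(x, y) = sum(abs2, x) - sum(abs2, y) + x[1] * y[1]

nx, ny = 2, 2
function flipped_grad!(F, z)
    x, y = z[1:nx], z[nx+1:end]
    F[1:nx]     .= -ForwardDiff.gradient(u -> f(u, y), x)  # descend in x
    F[nx+1:end] .=  ForwardDiff.gradient(v -> f(x, v), y)  # ascend in y
end

# Anderson-accelerated fixed-point iteration z += beta * F(z);
# a zero of F is a critical point of f
res = nlsolve(flipped_grad!, ones(nx + ny); method = :anderson, m = 5, beta = 0.5)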

It’s nice to see other people trying to solve nonlinear saddle point problems!

We’ve been developing various saddle point search methods in SaddleSearch.jl and Isaac.jl. These are experimental packages coming out of a research project, hence not registered, though mostly due to lack of time. If anybody other than myself actually wanted to use them, then that would motivate me to invest more time :). What makes saddle point search so interesting is that there are no algorithms with convergence guarantees!

I think there are three typical situations and it is important to understand which of these you are in:

(1) You have a good guess for a saddle point => just use Newton’s method, and if you don’t want to implement Hessians, then Newton-Krylov. Probably look at NLsolve.jl. That said, the idea of Isaac.jl is to “modulate” the Newton iteration in such a way that it repels from all critical points except for index-1 saddles.

(2) You have two potential minima and want to find a saddle that connects them => use the NEB (nudged elastic band) or String methods; you can find prototype implementations in SaddleSearch.jl, and there is a rough sketch of the string idea right after this list.

(3) You start far away from any saddle => use the Dimer or GAD methods; these are very easy to implement yourself if you don’t care about performance, but a huge PITA to make both fast and robust. Some variants are again implemented in SaddleSearch.jl, and a bare-bones GAD sketch follows below.
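
To illustrate (2): a very naive string method, nothing like the SaddleSearch.jl implementations, just to show the skeleton (explicit Euler relaxation plus equal-arclength reparametrisation; nimages, h and nsteps are arbitrary, and coincident images are not handled):

using ForwardDiff, LinearAlgebra

# naive string method: relax interior images by gradient descent, then
# reparametrise the path to equal arclength by linear interpolation
function string_method(f, a, b; nimages = 20, h = 1e-3, nsteps = 5_000)
    t = range(0, 1; length = nimages)
    X = [(1 - s) .* a .+ s .* b for s in t]   # straight initial path
    for _ in 1:nsteps
        for i in 2:nimages-1                  # endpoints stay at the minima
            X[i] = X[i] .- h .* ForwardDiff.gradient(f, X[i])
        end
        ℓ = [0.0; cumsum(norm.(diff(X)))]     # cumulative arclength
        ℓ ./= ℓ[end]
        Xnew = [copy(X[1])]
        for s in t[2:end-1]
            j = findlast(ℓ .<= s)             # segment containing s
            w = (s - ℓ[j]) / (ℓ[j+1] - ℓ[j])
            push!(Xnew, (1 - w) .* X[j] .+ w .* X[j+1])
        end
        push!(Xnew, copy(X[end]))
        X = Xnew
    end
    return X   # the image with the largest f(X[i]) estimates the saddle
end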

Regarding (2, 3): depending on your skill you may be able to modify those codes for your problem, or just write your own from the original references. I worry they will not be immediately usable out of the box in SaddleSearch.jl, and definitely not in Isaac.jl.
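
And to illustrate (3), a bare-bones GAD iteration, if only to show how little code the naive version needs (explicit Euler, fixed step size, ForwardDiff Hessians; all parameters made up):

using ForwardDiff, LinearAlgebra

# naive gentlest ascent dynamics (GAD): step along the force with its
# component along the direction v reflected, while rotating v towards
# the lowest-curvature mode
function gad(f, x0, v0; h = 1e-2, nsteps = 10_000, tol = 1e-8)
    x, v = copy(x0), normalize(copy(v0))
    for _ in 1:nsteps
        g = ForwardDiff.gradient(f, x)
        norm(g) < tol && break
        H = ForwardDiff.hessian(f, x)
        x += h .* (2 .* dot(v, g) .* v .- g)     # reflected gradient step
        v += h .* (dot(v, H * v) .* v .- H * v)  # rotate v
        v = normalize(v)
    end
    return x, v
end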

Can you elaborate on how to use NLsolve for this? Take first derivatives and solve for the zero of the system?

Wow, this is a lot. Probably most of these are overkill for my application, but I will definitely take a look. Thanks for this.

One thing I am seeing, though, is that those packages are not updated for Julia 1.1. I first get a warning when installing them:

┌ Warning: julia version requirement for package Isaac not satisfied
└ @ Pkg.Operations /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:183
┌ Warning: julia version requirement for package SaddleSearch not satisfied
└ @ Pkg.Operations /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:183

And then they can’t precompile:

julia> using SaddleSearch
[ Info: Precompiling SaddleSearch [c59cb286-31ed-5500-91c1-888ad4f8e0ac]
ERROR: LoadError: LoadError: syntax: extra token "IterationLog" after end of expression
Stacktrace:
 [1] include at ./boot.jl:326 [inlined]
 [2] include_relative(::Module, ::String) at ./loading.jl:1038
 [3] include at ./sysimg.jl:29 [inlined]
 [4] include(::String) at /Users/amrods/.julia/packages/SaddleSearch/ecca3/src/SaddleSearch.jl:1
 [5] top-level scope at none:0
 [6] include at ./boot.jl:326 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1038
 [8] include(::Module, ::String) at ./sysimg.jl:29
 [9] top-level scope at none:2
 [10] eval at ./boot.jl:328 [inlined]
 [11] eval(::Expr) at ./client.jl:404
 [12] top-level scope at ./none:3
in expression starting at /Users/amrods/.julia/packages/SaddleSearch/ecca3/src/misc.jl:23
in expression starting at /Users/amrods/.julia/packages/SaddleSearch/ecca3/src/SaddleSearch.jl:9
ERROR: Failed to precompile SaddleSearch [c59cb286-31ed-5500-91c1-888ad4f8e0ac] to /Users/amrods/.julia/compiled/v1.1/SaddleSearch/4fEhD.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
 [3] _require(::Base.PkgId) at ./loading.jl:960
 [4] require(::Base.PkgId) at ./loading.jl:858
 [5] require(::Module, ::Symbol) at ./loading.jl:853
julia> using Isaac
[ Info: Precompiling Isaac [5a2c6b2a-d244-58c1-9360-037670a2bcca]
┌ Warning: Package Isaac does not have ForwardDiff in its dependencies:
│ - If you have Isaac checked out for development and have
│   added ForwardDiff as a dependency but haven't updated your primary
│   environment's manifest file, try `Pkg.resolve()`.
│ - Otherwise you may need to report an issue with Isaac
└ Loading ForwardDiff into Isaac from project dependency, future warnings for Isaac are suppressed.
ERROR: LoadError: LoadError: LoadError: Only works on type-defs or named tuples.
Did you try to construct a NamedTuple but omitted the space between the macro and the NamedTuple?
Do `@with_kw (a=1, b=2)` and not `@with_kw(a=1, b=2)`.

Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] @with_kw(::LineNumberNode, ::Module, ::Vararg{Any,N} where N) at /Users/amrods/.julia/packages/Parameters/NholY/src/Parameters.jl:624
 [3] include at ./boot.jl:326 [inlined]
 [4] include_relative(::Module, ::String) at ./loading.jl:1038
 [5] include at ./sysimg.jl:29 [inlined]
 [6] include(::String) at /Users/amrods/.julia/packages/Isaac/0iMXE/src/Isaac.jl:2
 [7] top-level scope at none:0
 [8] include at ./boot.jl:326 [inlined]
 [9] include_relative(::Module, ::String) at ./loading.jl:1038
 [10] include(::Module, ::String) at ./sysimg.jl:29
 [11] top-level scope at none:2
 [12] eval at ./boot.jl:328 [inlined]
 [13] eval(::Expr) at ./client.jl:404
 [14] top-level scope at ./none:3
in expression starting at /Users/amrods/.julia/packages/Isaac/0iMXE/src/testsets.jl:46
in expression starting at /Users/amrods/.julia/packages/Isaac/0iMXE/src/testsets.jl:46
in expression starting at /Users/amrods/.julia/packages/Isaac/0iMXE/src/Isaac.jl:4
ERROR: Failed to precompile Isaac [5a2c6b2a-d244-58c1-9360-037670a2bcca] to /Users/amrods/.julia/compiled/v1.1/Isaac/O8bBO.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
 [3] _require(::Base.PkgId) at ./loading.jl:960
 [4] require(::Base.PkgId) at ./loading.jl:858
 [5] require(::Module, ::Symbol) at ./loading.jl:853

Can you elaborate on how to use NLsolve for this? Take first derivatives and solve for the zero of the system?

Exactly.
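
In its simplest form, that looks something like this (toy function; NLsolve’s default trust-region solver on the gradient, with a finite-difference Jacobian, which here is the Hessian of f):

using NLsolve, ForwardDiff

f(z) = z[1]^2 - z[2]^2 + z[1] * z[2]   # toy function with a saddle at the origin
g!(G, z) = ForwardDiff.gradient!(G, f, z)

# solve grad f(z) = 0; converges to whichever critical point is nearest
res = nlsolve(g!, [0.3, -0.2])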

@cortner note that this problem has an explicit saddle separation: min_x max_y f(x,y). That simplifies things a lot, and might actually be a good basis for thinking about robust algorithms.

@amrods essentially yes. But you don’t know how to compute the derivative of F(x) = ∇f(x) (that would be the Hessian of f), so you don’t want to do a Newton-like iteration. Also, Newton doesn’t see minima, saddles or maxima, it just sees critical points (so if you start near a minimum it will converge to that). OTOH the flipped gradient iteration converges locally to saddles (and not to minima or maxima), so Anderson acceleration on top of that (https://github.com/JuliaNLSolvers/NLsolve.jl#anderson-acceleration) may be a good bet (although Anderson will still try to converge to any critical point).

Correct, these packages have not yet been updated; hence my point that you can adapt the codes but can’t use them directly.

explicit saddle separation: min_x max_y f(x,y)

Is this really the general context or is this just a simplification? Indeed, this makes life a lot easier. Most likely, a nested solution would be best then, no? And fairly straightforward to implement?

That is what my application requires, if that helps. I’m essentially solving an optimization problem with constraints using the Lagrange method. So I need to find the saddle point of the Lagrangean function, and I place the multipliers in the last argument of the Lagrangean; a toy version of what I mean is below.
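
For example (a made-up instance), minimizing |x|^2 subject to x[1] + x[2] = 1:

# the multiplier vector λ goes in the last argument
lagrangean(x, λ) = sum(abs2, x) - λ[1] * (x[1] + x[2] - 1)
# the saddle point is x = [0.5, 0.5], λ = [1.0]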

@amrods in that case wouldn’t you be better off using a constrained solver?

@cortner I would imagine it’s better to just couple everything, but the nested solve might be more robust

I was about to give the same answer. Please use a constrained optimisation solver if this is what your problem is!

Especially since in the usual case the maximization with respect to the dual variables is +∞ whenever the constraints are violated…

I am trying to implement the method in this paper (here if you can’t download the first link). As I understand it, there are some benefits to using the Lagrangean in this dynamic context, and I would also like to obtain the values of the multipliers.

This is then outside my domain, sorry. But maybe consider an augmented Lagrangian formulation, which reduces the problem to a sequence of minimisation problems in a fairly elegant and efficient way; a rough sketch is below. There is even an old package that @pkofod and I played around with somewhere.
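
Very roughly, the loop looks like this (a bare-bones sketch with a made-up penalty schedule and tolerances, using Optim for the inner solves; a real implementation needs safeguards):

using Optim, LinearAlgebra

# basic augmented Lagrangian loop for: minimize f(x) subject to c(x) = 0
function auglag(f, c, x0; μ = 10.0, iters = 20, tol = 1e-8)
    x = copy(x0)
    λ = zeros(length(c(x0)))
    for _ in 1:iters
        # inner unconstrained minimisation of the augmented Lagrangian
        obj = z -> f(z) - dot(λ, c(z)) + (μ / 2) * sum(abs2, c(z))
        x = Optim.minimizer(optimize(obj, x, BFGS(); autodiff = :forward))
        λ -= μ * c(x)      # first-order multiplier update
        norm(c(x)) < tol && break
        μ *= 2             # tighten the penalty
    end
    return x, λ            # the multipliers come out for free
end

# toy check: minimize |x|^2 s.t. x[1] + x[2] = 1  =>  x = [0.5, 0.5], λ = [1.0]
x, λ = auglag(z -> sum(abs2, z), z -> [z[1] + z[2] - 1], [0.0, 0.0])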

I think a sequence of optimization problems would work. However, I am also interested in the multipliers; I wonder if any of the optimization packages report them.

Most constrained optimisation algorithms will give you the multipliers as well as the optimum.
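
For example, with JuMP and Ipopt (assuming a recent JuMP; the exact syntax and the dual’s sign convention vary by version and solver):

using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2])
@constraint(model, con, x[1] + x[2] == 1)
@objective(model, Min, sum(x .^ 2))
optimize!(model)

value.(x)   # the optimum, [0.5, 0.5]
dual(con)   # the multiplier on the constraint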

I doubt it’s super robust, but it could be fun to run some simple benchmarks on problems with known solutions. If only we had all the time in the world :)

What’s the package? I can try it.