A Unified Interface for Rootfinding

ChrisRackauckas · December 2, 2016, 4:52pm

One of the most useful developments of Julia is the concept of the metapackage. A package management system coupled with multiple dispatch has lead to ecosystems like Plots.jl, DiffEqBase.jl (DifferentialEquations.jl), and MathProgBase.jl (JuMP) which allow you to use the same high level code to plug into many different packages in the same regime. This has been not only a big boost for users (there’s one interface to know!), but it’s actually been crazy amazing for package development: you can write a package which targets one interface and now all of the methods are available.

To build off of these successes, I would like to propose some kind of RootFindingBase.jl. Building off of previous interfaces, what I would like to see is something like:

find_zeros(f,interval,x0,alg(),kwargs...)

which would dispatch find_zeros using the type of alg to different solver methods in different packages. Candidates that I know of include:

Roots.jl
Sundials.jl (KINSOL)
NLsolve.jl

The idea is so that way I could so something like

find_zeros(f,(0.0,1.0),0.5,kinsol(linear_solver=banded),abstol=1e-5)

to call Sundials.jl and

find_zeros(f,(0.0,1.0),0.5,nlsolve(autodiff=true),abstol=1e-5)

to call NLsolve.jl

The reason why this is important to me is because it would allow me to write algorithms at this level, and let the user choose any nonlinear solver which fits the problem (and also easily benchmark between them!). (This will be really nice for JuliaDiffEq because then implicit methods could easily swap out rootfinding algorithms!).

The questions going forward are:

Are package maintainers on-board to add this kind of interface?
What are the problems with the current proposal?
How do we make it robust to the different types of rootfinding problems which exist?

Some details to discuss are:

The form of f. Allow both inplace and not inplace (allow not inplace for the univariate case, and auto-convert to an inplace function for problems on vectors?)
Not every method needs an interval or an initial condtion. What do we do here?
What should the solution type look/act like?
Anything else you can think of.

Tamas_Papp · December 3, 2016, 4:12pm

I think that the idea is nice but it remains to be seen whether you can unify rootfinding in a single interface. Perhaps it would be best if you made a library that does this, similarly to Plots.jl — I guess most problems will surface when you start doing this and can be dealt with then. Regarding syntax, I think you could try following Optim.jl’s conventions, with something like

find_zeros(functionobject,         # would also specify inplace, derivatives, like Optim
           initial_x_or_interval,  # a point for quasi-Newton, interval for bisection
           optimizer_object,       # eg kinsol(...)
           general_options_object) # tolerances, whether to do AD, ...

dlfivefifty · December 4, 2016, 5:22am

How would global rootfinders fit in, as in ApproxFun’s roots?

ChrisRackauckas · December 6, 2016, 10:42pm

Maybe separate local and global rootfinding problems?

dlfivefifty · December 7, 2016, 9:53am

I wonder if a better idiom is to follow DifferentialEquations.jl, so that you’d setup up a problem on which you call roots.

ChrisRackauckas · December 13, 2016, 3:24am

Yeah, what should the problem types be LocalNonlinearSolve and GlobalNonlinearSolve? Or is more necessary?

dlfivefifty · December 13, 2016, 3:30am

Having Solve in the problem name doesn’t seem right… I’d say something like RootProblem and RootsProblem to replicate eig vs eigs.

ChrisRackauckas · December 13, 2016, 4:18am

I don’t like the difference between s and not. I think it should be a bit more explicit: GlobalRootProblem vs LocalRootProblem.

ChrisRackauckas · December 13, 2016, 8:20pm

This all doesn’t seem too hard to actually do and would get some really nice/usable results: would it be a good Google Summer of Code project?

ChrisRackauckas · December 14, 2016, 6:34am

After roaming around the package ecosystem a bit, here’s what I got so far.

Two types: LocalRootProblem and GlobalRootProblem.
Both take a general f. This would allow packages like Roots.jl to dispatch on typeof(f)==Polynomial, or ApproxFun.jl to write rootfinding dispatches on Fun.
Both have an optional argument for interval = (-Inf,Inf). Some methods may need to error if these values are Inf, but this default means “all real numbers”. (Is there a way or a need to generalize this to complex?).
Optional argument for a Jacobian. Again, this should be left with no type-restriction. It’s an optional argument because of dispatch, with default being nothing.
A LocalRootProblem also has an optional (or keyword?) argument for the initial condtional x0.

Maybe we should just go with keyword arguments because order is hard to choose, and keyword args should dispatch soon enough in Julia.

I propose usage which looks like this. We can accept two function definitions. A non-inplace version is good for univariate problems:

type LocalRootProblem{F,G,probType} <: RootProblem
  f::F
  g::G # Jacobian
  interval::Tuple{probType,probType}
  x0::probType
end

...

Usage:

f(x) = 2x
g(x) = 2 # The Jacobian
prob = LocalRootProblem(f,init=x0,interval=(0.0,1.0),jac=jac)

This makes a problem where the Jacobian is known while

prob = LocalRootProblem(f,init=x0,interval=(0.0,1.0))

has the second type-parameter Void which tells the solver that the Jacobian does not exist (a function has_jac(prob::RootProblem{F,G,IType}) = G!=Void can be used for this check at compile-time). Then it could be like:

find_zeros(prob,
                  nlsolve(nlsolve_specific_option=...),
                  general_options_object)

matching Optim.jl like @Tamas_Papp mentioned. The in-place version for functions would be

function f!(x,out)
  out[1] = 2*x[1]
end

which would be the same equation as above. GlobalRootProblem should then be similar.

One thing I noticed is that the problem type should match both x0 and the interval. This would be the type that the solution is found in. Thus one can set the solvers to use BigFloat values by setting interval=(big(a),big(b)) or x0 = big(...). I believe any method has to either set an initial condition or an interval to solve on, so this will always work. Correct me if I’m wrong about that.

Are there any holes in this proposal? If not, I think I could make a new package in JuliaMath and put up a first draft and put some PRs out to Roots.jl, Sundials.jl, and NLsolve.jl. Then I know @dlfivefifty wants to make dispatches for ApproxFun.jl on Fun types, and so it’ll already gain some adoption. What’s a good name for the package? RootBase.jl?

Tamas_Papp · December 14, 2016, 6:59am

@ChrisRackauckas: nice summary. I would suggest the following:

Embed the the function and its derivatives (particularly the Jacobian) in something like Optim’s DifferentiableFunction (or even the same thing, would be nice not to proliferate types). This would also allow approximations like from ApproxFun you mention above.
replace interval with domain, which can be ℝ (high time we had a type representing that), [a,∞), [a,b] (see the issue for IntervalSets.jl I link below), and Cartesian products of these (for box constraints along coordinates), of course for multivariate we need a type for this too.

https://github.com/JuliaMath/IntervalSets.jl/issues/9

dlfivefifty · December 14, 2016, 9:23am

To be clear, there’s three very different places where ApproxFun fits in:

roots(f::Fun) in ApproxFun/src/Extras/roots.jl is an algorithm (from Chebfun) for finding all roots, and could be one of the methods for solving GlobalRootProblem.
The implementation of roots(f::Fun) could pass through GlobalRootProblem and potentially use other algorithms besides the one in ApproxFun/src/Extras/roots.jl
A generalization of (local) root finding is solving ODEs, which is currently implemented with Newton iteration:

x = Fun()
N = u->[u(-1.)-c;u(1.);ε*u''+6*(1-x^2)*u'+u^2-1.]
u = newton(N,u0)

This could potentially be solved by a unified interface by constructing a LocalRootProblem whose probType is Fun

ChrisRackauckas · December 15, 2016, 4:36am

What’s the purpose of the DifferentiableFunction type? Why does it help?

I like this idea of using domains. It’s finding the right implementation that would be the problem.

Tamas_Papp · December 15, 2016, 8:36am

I am probably not the best person to ask about it, but see this discussion and PR:

github.com/JuliaNLSolvers/Optim.jl

RFC: Revise Unconstrained and Box Constrained Optimization API

opened 11:10PM - 23 Mar 13 UTC

closed 02:37PM - 18 Apr 13 UTC

johnmyleswhite

enhancement

We currently have a mixed API for optimization: the functions I wrote work very …differently from the functions that Tim wrote. To unify things, I'd like to propose a new API that I hope we can all standardize on. The proposal is quite long, but I think it touches on all of the issues we need to confront. I'm opening it as an issue because I expect we'll want to debate the design for a while before implementing anything. To simplify the discussion, let's introduce some notation. My API worked exclusively with pure functions, which I'll refer to as: - `f` denotes a function from R^n to R. The result is returned as a `Real` of some sort. - `g` denotes the gradient of `f`, which makes `g` a function from R^n to R^n. The result is returned as a `Vector{T}` for some `Real` type `T`. - `h` denotes the Hessian of `f`, which makes `h` a function from R^n to R^n*m. The result is returned as a `Matrix{T}` for some `Real` type `T`. Tim's API employed mutating functions, which I'll refer to as: - `f` is the same as above. - `g!` denotes the gradient of `f`, but one which mutates an input argument so that it is called as `g!(storage, x)`. I'd like to transition this over to `g!(x, storage)`. As will be seen, I'd also like to remove the `nothing` arguments being used in the current implementation. _Because function is impure, nothing is returned_. - `h!` denotes the Hessian of `f`, but one which mutates an input argument so that it is called as `h!(storage, x)`. Again, I'd like to transition this over to `h!(x, storage)`. _Because function is impure, nothing is returned_. - `fg!` denotes a coupled pair of function and gradient that get evaluated simultaneously for efficiency. This coupled pair is called as `fg!(x, storage)` and returns the value of `f` evaluated at `x` after mutating `storage`. One could also consider functions like `gh!` and `fgh!`, but I'm not currently aware of a proposed use for those things. Using this notation, my proposed new API is the following: - We should entirely remove the use of `g` and `h` and enforce the use of mutating functions `g!` and `h!`. This may confuse some users, but I think the gains are worth the pain. - We should allow automatic creation of methods for `g!` and `h!` using finite differencing. I've already added the ability to do finite-differencing by mutating an array to the Calculus package in preparation for this. - We should construct ad hoc immutable types that can wrap up functions like `f`, `g!` and `fg!` into a single unit that permits multiple dispatch. These differentiable functions will become the core backend construct of the Optim package. End-users will not need to provide them, because we will generate these values for users automatically. But users who want to exploit forms like `fg!` can use these types, which will prevent the automatic creation of wrappers. Specifically, I propose creating the following types and methods: ``` immutable OnceDifferentiableFunction f::Function g!::Function end immutable CoupledOnceDifferentiableFunction f::Function g!::Function fg!::Function end immutable TwiceDifferentiableFunction f::Function g!::Function h!::Function end ``` Using these functions, we could create methods like the following: For pure function calls: - `callf(Function, x)` - `callf(OnceDifferentiableFunction, x)` - `callf(CoupledOnceDifferentiableFunction, x)` - `callf(TwiceDifferentiableFunction, x)` For mutating gradient function calls: - `callg!(OnceDifferentiableFunction, x, storage)` - `callg!(CoupledOnceDifferentiableFunction, x, storage)` - `callg!(TwiceDifferentiableFunction, x, storage)` For mutating Hessian function calls: - `callh!(TwiceDifferentiableFunction, x, storage)` For simultaneous function and mutating gradient function calls: - `callfg!(OnceDifferentiableFunction, x, storage)` - `callfg!(CoupledOnceDifferentiableFunction, x, storage)` - `callfg!(TwiceDifferentiableFunction, x, storage)` Using these types and functions, we should be able to express all of the computations we're doing now while doing much less memory allocation. Also, the use of multiple dispatch should make it easier to do redirection by automatically creating gradients when needed.

github.com/JuliaNLSolvers/Optim.jl

[RFC] New Optimization API

JuliaNLSolvers:master ← JuliaNLSolvers:mutate

opened 10:47PM - 24 Mar 13 UTC

johnmyleswhite

+1329 -1234

This pull request represents the current _unfinished_ state of my work on a new …API for unconstrained optimization in Julia. It implements the ideas described in #16. There's still another week or so of work to do before this work will be ready for merging, but I wanted to get a demo out there to get early feedback. It would be good to hear from @andreasnoackjensen about the changes I've made to the backtracking line search function. In my ad hoc testing, I've found that completely removing the loop in which the step-size can increase above `1.0` substantially lowers the number of iterations required to reach convergence. This may be an artifact of the problems I've tried out, but I'd like to make sure that we push as hard as possible on improving the performance of these functions.

In particular this explanation.

johnmyleswhite · December 15, 2016, 3:40pm

I don’t think DifferentiableFunction is that useful really (as it could clearly be replaced by tuples of functions), but I believe people intend to make more complex use of it in the future by introducing things like memoization. If they weren’t, I would propose getting rid of it. But the indirection could be put to good use when building more complex features.

simonbyrne · December 15, 2016, 4:00pm

I’ve recently been thinking that this sort of thing might now be better handled using traits, e.g.

has_gradient(::Function) = false # generic fallback


function rosenbrock(x::Vector)
    return (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
end
has_gradient(::typeof(rosenbrock)) = true
function gradient!(::typeof(rosenbrock), x::Vector, storage::Vector)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
end

Not sure how this would work generally though

mauro3 · December 15, 2016, 7:12pm

This is a bit how we do it over in DiffEqBase: https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/6cc12def8af8e7e51b458d65f48e43d97aee4af1/src/extended_functions.jl

pkofod · December 18, 2016, 9:29pm

Up front: I know a lot can be done using proper documentation and educational material, but…

Don’t you think the average user of a general optimization package might be confused by this way of specifying it?

That being said, we really don’t use the whole (Twice)DifferentiableFunction-machinery for much right now, but if you look through the issues, there are some different ideas that people have for using it, and I do intend to see (soon) if we can implement some of these ideas or if we should ditch it all together.

ChrisRackauckas · January 23, 2017, 7:37pm

Where exactly is this at now? I kind of got lost but want to revive this.

j_verzani · January 24, 2017, 12:56am

FWIW Roots.jl had a rewrite to be able to support this effort once an API is decided on.

Topic		Replies	Views
Packages for Optimization General Usage question , package , jump , optimization	22	1481	December 1, 2020
Is there a faster bisection root solver (that uses atol)? General Usage question , roots	22	4223	August 13, 2018
Autodiff-ing a function defined by the result of Roots.find_zero fails General Usage roots , forwarddiff , autodiff	12	794	September 29, 2022
Complex root finder for a general function f(z)? New to Julia question , package , roots	40	5120	May 5, 2020
Finding saddle point General Usage optim	36	3666	May 29, 2020

A Unified Interface for Rootfinding

Related topics