A Unified Interface for Rootfinding

One of the most useful developments of Julia is the concept of the metapackage. A package management system coupled with multiple dispatch has lead to ecosystems like Plots.jl, DiffEqBase.jl (DifferentialEquations.jl), and MathProgBase.jl (JuMP) which allow you to use the same high level code to plug into many different packages in the same regime. This has been not only a big boost for users (there’s one interface to know!), but it’s actually been crazy amazing for package development: you can write a package which targets one interface and now all of the methods are available.

To build off of these successes, I would like to propose some kind of RootFindingBase.jl. Building off of previous interfaces, what I would like to see is something like:

find_zeros(f,interval,x0,alg(),kwargs...)

which would dispatch find_zeros using the type of alg to different solver methods in different packages. Candidates that I know of include:

  1. Roots.jl
  2. Sundials.jl (KINSOL)
  3. NLsolve.jl

The idea is so that way I could so something like

find_zeros(f,(0.0,1.0),0.5,kinsol(linear_solver=banded),abstol=1e-5)

to call Sundials.jl and

find_zeros(f,(0.0,1.0),0.5,nlsolve(autodiff=true),abstol=1e-5)

to call NLsolve.jl

The reason why this is important to me is because it would allow me to write algorithms at this level, and let the user choose any nonlinear solver which fits the problem (and also easily benchmark between them!). (This will be really nice for JuliaDiffEq because then implicit methods could easily swap out rootfinding algorithms!).

The questions going forward are:

  1. Are package maintainers on-board to add this kind of interface?
  2. What are the problems with the current proposal?
  3. How do we make it robust to the different types of rootfinding problems which exist?

Some details to discuss are:

  1. The form of f. Allow both inplace and not inplace (allow not inplace for the univariate case, and auto-convert to an inplace function for problems on vectors?)
  2. Not every method needs an interval or an initial condtion. What do we do here?
  3. What should the solution type look/act like?
  4. Anything else you can think of.
8 Likes

I think that the idea is nice but it remains to be seen whether you can unify rootfinding in a single interface. Perhaps it would be best if you made a library that does this, similarly to Plots.jl — I guess most problems will surface when you start doing this and can be dealt with then. Regarding syntax, I think you could try following Optim.jl’s conventions, with something like

find_zeros(functionobject,         # would also specify inplace, derivatives, like Optim
           initial_x_or_interval,  # a point for quasi-Newton, interval for bisection
           optimizer_object,       # eg kinsol(...)
           general_options_object) # tolerances, whether to do AD, ...
1 Like

How would global rootfinders fit in, as in ApproxFun’s roots?

2 Likes

Maybe separate local and global rootfinding problems?

I wonder if a better idiom is to follow DifferentialEquations.jl, so that you’d setup up a problem on which you call roots.

Yeah, what should the problem types be LocalNonlinearSolve and GlobalNonlinearSolve? Or is more necessary?

Having Solve in the problem name doesn’t seem right… I’d say something like RootProblem and RootsProblem to replicate eig vs eigs.

1 Like

I don’t like the difference between s and not. I think it should be a bit more explicit: GlobalRootProblem vs LocalRootProblem.

2 Likes

This all doesn’t seem too hard to actually do and would get some really nice/usable results: would it be a good Google Summer of Code project?

After roaming around the package ecosystem a bit, here’s what I got so far.

  • Two types: LocalRootProblem and GlobalRootProblem.
  • Both take a general f. This would allow packages like Roots.jl to dispatch on typeof(f)==Polynomial, or ApproxFun.jl to write rootfinding dispatches on Fun.
  • Both have an optional argument for interval = (-Inf,Inf). Some methods may need to error if these values are Inf, but this default means “all real numbers”. (Is there a way or a need to generalize this to complex?).
  • Optional argument for a Jacobian. Again, this should be left with no type-restriction. It’s an optional argument because of dispatch, with default being nothing.
  • A LocalRootProblem also has an optional (or keyword?) argument for the initial condtional x0.

Maybe we should just go with keyword arguments because order is hard to choose, and keyword args should dispatch soon enough in Julia.

I propose usage which looks like this. We can accept two function definitions. A non-inplace version is good for univariate problems:

type LocalRootProblem{F,G,probType} <: RootProblem
  f::F
  g::G # Jacobian
  interval::Tuple{probType,probType}
  x0::probType
end

...

Usage:

f(x) = 2x
g(x) = 2 # The Jacobian
prob = LocalRootProblem(f,init=x0,interval=(0.0,1.0),jac=jac)

This makes a problem where the Jacobian is known while

prob = LocalRootProblem(f,init=x0,interval=(0.0,1.0))

has the second type-parameter Void which tells the solver that the Jacobian does not exist (a function has_jac(prob::RootProblem{F,G,IType}) = G!=Void can be used for this check at compile-time). Then it could be like:

find_zeros(prob,
                  nlsolve(nlsolve_specific_option=...),
                  general_options_object)

matching Optim.jl like @Tamas_Papp mentioned. The in-place version for functions would be

function f!(x,out)
  out[1] = 2*x[1]
end

which would be the same equation as above. GlobalRootProblem should then be similar.

One thing I noticed is that the problem type should match both x0 and the interval. This would be the type that the solution is found in. Thus one can set the solvers to use BigFloat values by setting interval=(big(a),big(b)) or x0 = big(...). I believe any method has to either set an initial condition or an interval to solve on, so this will always work. Correct me if I’m wrong about that.

Are there any holes in this proposal? If not, I think I could make a new package in JuliaMath and put up a first draft and put some PRs out to Roots.jl, Sundials.jl, and NLsolve.jl. Then I know @dlfivefifty wants to make dispatches for ApproxFun.jl on Fun types, and so it’ll already gain some adoption. What’s a good name for the package? RootBase.jl?

1 Like

@ChrisRackauckas: nice summary. I would suggest the following:

  1. Embed the the function and its derivatives (particularly the Jacobian) in something like Optim’s DifferentiableFunction (or even the same thing, would be nice not to proliferate types). This would also allow approximations like from ApproxFun you mention above.

  2. replace interval with domain, which can be ℝ (high time we had a type representing that), [a,∞), [a,b] (see the issue for IntervalSets.jl I link below), and Cartesian products of these (for box constraints along coordinates), of course for multivariate we need a type for this too.

https://github.com/JuliaMath/IntervalSets.jl/issues/9

1 Like

To be clear, there’s three very different places where ApproxFun fits in:

  1. roots(f::Fun) in ApproxFun/src/Extras/roots.jl is an algorithm (from Chebfun) for finding all roots, and could be one of the methods for solving GlobalRootProblem.
  2. The implementation of roots(f::Fun) could pass through GlobalRootProblem and potentially use other algorithms besides the one in ApproxFun/src/Extras/roots.jl
  3. A generalization of (local) root finding is solving ODEs, which is currently implemented with Newton iteration:
x = Fun()
N = u->[u(-1.)-c;u(1.);ε*u''+6*(1-x^2)*u'+u^2-1.]
u = newton(N,u0)

This could potentially be solved by a unified interface by constructing a LocalRootProblem whose probType is Fun

What’s the purpose of the DifferentiableFunction type? Why does it help?

I like this idea of using domains. It’s finding the right implementation that would be the problem.

I am probably not the best person to ask about it, but see this discussion and PR:

In particular this explanation.

I don’t think DifferentiableFunction is that useful really (as it could clearly be replaced by tuples of functions), but I believe people intend to make more complex use of it in the future by introducing things like memoization. If they weren’t, I would propose getting rid of it. But the indirection could be put to good use when building more complex features.

I’ve recently been thinking that this sort of thing might now be better handled using traits, e.g.

has_gradient(::Function) = false # generic fallback


function rosenbrock(x::Vector)
    return (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
end
has_gradient(::typeof(rosenbrock)) = true
function gradient!(::typeof(rosenbrock), x::Vector, storage::Vector)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
end

Not sure how this would work generally though

3 Likes

This is a bit how we do it over in DiffEqBase: https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/6cc12def8af8e7e51b458d65f48e43d97aee4af1/src/extended_functions.jl

Up front: I know a lot can be done using proper documentation and educational material, but…

Don’t you think the average user of a general optimization package might be confused by this way of specifying it?

That being said, we really don’t use the whole (Twice)DifferentiableFunction-machinery for much right now, but if you look through the issues, there are some different ideas that people have for using it, and I do intend to see (soon) if we can implement some of these ideas or if we should ditch it all together.

Where exactly is this at now? I kind of got lost but want to revive this.

FWIW Roots.jl had a rewrite to be able to support this effort once an API is decided on.