[Ann] IntervalRootFinding.jl for finding all roots of a multivariate function

Can I specify the biggest 4 roots or smallest 3 roots to find and avoid computing the rest? This can have application in finding some of the eigenvalues of a tridiagonal matrix using the characteristic polynomial which is not hard to build for tridiagonal matrices. This can in turn be used in the Lanczos method with implicit restarts. Also are there any complexity guarantees for the single variable case?

I suppose if the range is halved every iteration that it takes around log_2(\frac{b-a}{\epsilon})*d where d is the number of roots desired, [a,b] is the initial range and \epsilon is the precision. Correct me if I am wrong. If that’s the case, that’s pretty good if lousy bounds can be obtained for the eigenvalues. Since every polynomial evaluation is O(n) where n is the size of the tridiagonal matrix, the whole eigenvalue problem should be O(nd) which is not bad.

Assume we deal only with continuous functions.

In 1-d the (Lefshetz)-index of a root is +1 if the function is increasing at the zero, and -1 if it is decreasing. The index of an interval with no roots at the boundary is zero if f has the same sign on both ends, +1 if it is negative on the left and positive on the right, and -1 if it is positive on the left and negative on the right.

An interval with nonzero index must contain a root (intermediate value theorem).

In n dimensions the index of a rectangle without zeros on the boundary could be computed by the following: Consider f as a map from the boundary to the unit sphere (normalize the output); this induces a map in homology (boundary is a topological sphere), which has the form x → lambda*x; then lambda is the index.

The index of a regular (invertible jacobian) root is the sign of the determinant of the jacobian at the zero. In this case, this is also the index of a tiny rectangle containing the root. The second definition allows us to also define the index of an isolated zero for continuous (not differentiable) functions, but this is more of theoretical value.

Theorem: If f has no roots on the boundary of a region, and 0 is a regular value (all roots have invertible jacobian), then the index of the region is the sum of the indices of all the roots. Also, if a region has nonzero index then there must be a root inside (might be singular, though).

I would guess that you are computing the indices of rectangles anyway (even if maybe under a different name); the output would be useful for me.

The index also has very nice properties with respect to pertubations (bifurcation theory): Suppose you continuously deform the f, such that there are no roots on the boundary for all of the deformation. Then, roots can be created and destroyed inside; but, we see that generically roots must be born in pairs (one with positive index, one with negative index), can move around, etc; and can die if a positive and a negative root collide.

edit: For an example for non {+1,-1,0}-index consider z->z^2 in the complex plane; the root has index 2, and indeed, small pertubations of the function will split the root (generically two solutions to quadratic equations). Preferred output would tell me: There is this tiny rectangle with index 2; outside there are no roots. Checking whether we have one or two roots is generally very hard: Identity for computable numbers is undecideable after all.

…and a second question (maybe more like feature request):

Can I give a tolerance for my f? That is, sure I have an f, but I am really interested in some g close to f.

Application: g is e.g. the solution to some differential equation; f is the output of a diffeq solver. Now I have a discretization error (not coming from floating point) that would probably need to be user-supplied.

Oh, nice, root-finding gives computer-assisted proofs of various properties of the solution g. Then I plug this into my favorite mathematical machinery (eg conley index) to gain a computer-assisted proof that, for example, my differential equation is indeed chaotic in this region, and that other region contains an attractor, or another region does not intersect the global attractor.

I have not seen software for such purposes yet that was not a total pain to use, and have not seen any viable ecosystem for computer-assisted proofs in analysis yet. That might be my blindness, though: I am coming at this from the analysis side, not the proof-assistant side, nor the numerics side.

Something that makes computer-assisted proofs more accessible for mathematicians in analysis, or even practitioners in applied fields, would be a huge boon, imho.

Pinging @ChrisRackauckas for comment. Example for a really cool blog-post would e.g. a short computer-assisted proof of the chaoticity of the Lorenz attractor (short julia code, citation of the papers explaining the mathematical machinery).

While not a computer assisted proof (that’s kind of different), DynamicalSystems.jl has tooling for detection of chaos.

The tolerance question is, in some sense, automatic: interval methods can only deal with quantities that are open sets.

The chaoticity of the Lorenz attractor is unfortunately far out of reach of a short blog post, as far as I know.
We are also still a way away from having a rigorous ODE integrator (but it’s in process…)

Which software have you seen that is a pain to use? At least we should be able to do somewhat better with Julia…

Thanks for the explanation about the Lefschetz index.
Unfortunately I don’t know how to actually go about calculating this or using it (without actually calculating all the roots).

That’s a good question.
The only way that occurs to me is to bias the bisection procedure to look for either big or small values first, and to quit once it has found the requisite number of roots. This seems plausible and not too difficult to implement (?)

What is the efficient algorithm to construct the characteristic polynomial for tridiag matrices?

Oops, I see that interval arithmetic is actually not explained in the docs.

The basic idea is to calculate with entire sets of real numbers, of which the simplest type are closed intervals [a,b] := \{x \in \mathbb{R}: a \le x \le b \}.

We define arithmetic operations and functions to act on intervals, in such a way that the result of the calculation is a new interval that is guaranteed to contain the true range of the function.

For example, for monotone functions like exp, we define

exp([a, b]) := [exp(a), exp(b)]

For non-monotone functions, like the squaring function, it is more complicated:

[a, b]^2 := [a^2, b^2]  if 0 < a < b
          = [0, max(a^2, b^2)]  if a < 0 < b
          = [b^2, a^2] if a < b < 0

We also have to round the lower endpoint down and the upper endpoint up to get guaranteed containment of the true result, since we are using floating-point arithmetic.

Once we have done this for all basic functions, we can define a complicated Julia function like

f(x) = sin(3x^2 - 2 cos(1/x))

and feed an interval in. Since at each step of the process, the result is an interval that is guaranteed to contain the range, the whole function has the same property.

For example,

using IntervalArithmetic

julia> using IntervalArithmetic

julia> f(x) = x^2 - 2
f (generic function with 1 method)

julia> X = 3..4
[3, 4]

julia> f(X)
[7, 14]

Since f(X) does not contain 0, the true range of the function f over the set X is guaranteed not to contain 0, and hence we have

Theorem: The function f has no root in the interval [3,4].

This theorem has been obtained using just floating-point calculations!

Further, we can even extend this to semi-infinite intervals:

julia> f(3..∞)
[7, ∞]

Thus we have excluded the entire domain [3, ∞) from possibly containing roots of f.

To move beyond just excluding regions and to actually guaranteeing existence and uniqueness for smooth functions, we use an interval version of the Newton method, which is described a bit here.

8 Likes

But you are calculating a union of tiny non-overlapping rectangles that contain all roots; and for each rectangle, you compute an answer that is either “yes, there is a unique root inside here” or “dunno; maybe? Can’t exclude roots.” (correct me if I’m wrong)

I think you could quite cheaply compute the index for the unique ones, as well as the index for the “maybe” rectangles.

For unique-by-newton (invertible jacobian) you use the sign of the determinant of the jacobian; for the others, I am not sure (you would need a slightly bigger rectangle that contains all the many nonsense rectangle in the 2nd below example; then one would need a fast way to compute the Lefshetz-index from the boundary of the rectangle, dropping dimensions until we are at intervals. I’ll have to look at your code to see whether most of the relevant computations are done already and think a little about how I would go about this).

So, in other words, each output rectangle should (hopefully) also carry its index.

Surprising examples:

reg=Complex((-1.0)..(1.0), (-1.0)..(1.0));

f(z)=z^2;
rts = roots(f, reg)
1-element Array{IntervalRootFinding.Root{Complex{IntervalArithmetic.Interval{Float64}}},1}:
 Root([0, 0] + [0, 0]*im, :unique)

f(z)=(z-0.1)^2;
rts = roots(f, reg)
10-element Array{IntervalRootFinding.Root{Complex{IntervalArithmetic.Interval{Float64}}},1}:
 Root([0.101562, 0.101804] + [0, 0.000995636]*im, :unknown)   
 Root([0.100585, 0.101065] + [0, 0.000909573]*im, :unknown)   
 Root([0.0996093, 0.100586] + [0, 0.000976563]*im, :unknown)  
 Root([0.0986328, 0.0996094] + [0, 0.000917576]*im, :unknown) 
 Root([0.0983273, 0.0986329] + [0, 0.00097156]*im, :unknown)  
 Root([0.101562, 0.101804] + [-0.000995636, 0]*im, :unknown)  
 Root([0.100585, 0.101065] + [-0.000909573, 0]*im, :unknown)  
 Root([0.0996093, 0.100586] + [-0.000976563, 0]*im, :unknown) 
 Root([0.0986328, 0.0996094] + [-0.000917576, 0]*im, :unknown)
 Root([0.0983273, 0.0986329] + [-0.00097156, 0]*im, :unknown) 

f(z)=(z-Complex(0.1,0.2))^2;
rts = roots(f, reg)
ERROR: no promotion exists for IntervalRootFinding.Compl{IntervalArithmetic.Interval{Float64}} and Complex{Float64}

edit: Also thank you for the explanation of interval arithmetic!

It’s a Fibonacci-like recurrence which can be looped over in reverse. For a symmetric tridiagonal n-by-n matrix A, if a_i is the i^{th} diagonal element and b_i is i^{th} off-diagonal element then the determinant of the leading i-by-i submatrix of A-\lambda I is p_i(\lambda) = (a_i-\lambda)p_{i-1}(\lambda)-b_{i-1}^2p_{i-2}(\lambda) for i = 2:n using p_0(\lambda)=1. p_n is the determinant of the matrix for a given \lambda.

OK thanks.

It’s easy enough to couple this with the complex root finder to get rigorous bounds on the eigenvalues!
The only issue will be eg. multiple roots, which will be returned as a single :unknown root, or roots that are too close together with the given tolerance, which will be returned in the same way.

Ya that would be cool!

You mean that it will find the eigenvalue but not its multiplicity? That’s not too bad.

Edit: post-processing the unknown intervals by an optimization algorithm that minimizes residual is also an option. If the minimum residual is not 0, then there is no eigenvalue in that interval, otherwise there is one with potentially large multiplicity. The multiplicity can then be found by a different algorithm if needed. It may work with IntervalConstraintProgramming.jl? I am not entirely sure what’s happening under the hood in IntervalRootFinding so excuse my ignorance.

1 Like

Hmm, the first is surprising indeed, since there is a double root that should not be able to be classed as unique?

The second is for this reason; the double root leads to a so-called “cluster effect” of unclassifiable boxes near the root.

The third is due to some silly promotion nonsense that may not be so trivial to fix correctly?

It would definitely be neat to be able to calculate the index, but it’s still not clear to me how to go about it.

One way of starting would be in 2d with f(xy)= ( (x,y)=xy; (g(x,y), h(x,y)) . We have a bunch of rectangles which might contain solutions (the regular roots are already done via the jacobian). Merge overlapping / adjecent rectangles.

Now we have a weirdly shaped region which may contain roots, but cannot contain roots on the boundary. I am assuming that the boundary is a union of tiny intervals I_k; and for each of the I_k, you have shown (via interval arithmetic) that g(I_k) and h(I_k) have no zeros (these are 1-d maps). Is this how you proceed internally?

Then, we simply walk the chain of intervals and verify that the boundary of the combined region is a 1-d sphere (consider the graph, where two intervals I_k are adjacent if they share an end-point; is this a cycle?). Then we count the winding number.

This way, we would compute the index without a single additional evaluation of the target function.

For higher dimensions, we might need to employ homology solvers for cubical complexes. I never worked with cubical complexes, so I’ll need to think / read about how this works, and google whether somebody wrote something for julia already.

1 Like

but this is a nice writeup, please think about putting it in the docs

1 Like

This is not quite right since g(I_k) and h(I_k) may have zeros.
What we have shown is that there are no simultaneous roots of g(I_k) and h(I_k).

Also, in general there may be roots in the small boxes on the boundary.
Further, the weirdly shaped region in general may be disjoint and non-simply connected.

Is there a reference I can look at for this stuff?

Isn’t this task impossible? How do you handle an infinite number of roots?

There is a limiting tolerance (box size).
Once that size is reached, the algorithm returns a box that contains an infinite number of roots and is labelled :unknown.

e.g.:


julia> rts = roots(x->x * sin(1/x), 0..10, Bisection, 1e-5);

julia> rts = roots(x->x * sin(1/x), rts, Newton, 1e-5)
365-element Array{IntervalRootFinding.Root{IntervalArithmetic.Interval{Float64}},1}:
 Root([0.318309, 0.31831], :unique)
 Root([0.159154, 0.159155], :unique)
 Root([0.106103, 0.106104], :unique)
 Root([0.0795774, 0.0795775], :unique)
 Root([0.0636619, 0.063662], :unique)
 Root([0.0530516, 0.0530517], :unique)
 Root([0.0454728, 0.0454729], :unique)
 Root([0.0397887, 0.0397888], :unique)
 Root([0.0353677, 0.0353678], :unique)
 Root([0.0318309, 0.031831], :unique)
 ⋮
 Root([8.58306e-05, 9.53675e-05], :unknown)
 Root([7.62939e-05, 8.58307e-05], :unknown)
 Root([6.67572e-05, 7.6294e-05], :unknown)
 Root([5.72204e-05, 6.67573e-05], :unknown)
 Root([4.76837e-05, 5.72205e-05], :unknown)
 Root([3.81469e-05, 4.76838e-05], :unknown)
 Root([2.86102e-05, 3.8147e-05], :unknown)
 Root([1.90734e-05, 2.86103e-05], :unknown)
 Root([9.53674e-06, 1.90735e-05], :unknown)
 Root([0, 9.53675e-06], :unknown)
1 Like

If you give a smaller and smaller tolerance, in principle it can find more and more of the infinite number of roots.