Tools for computer assisted proofs in analysis

I wanted to ask a somewhat open-ended question about what tools and packages exist in Julia (and outside of Julia) for computer-assisted “numerical” proofs, and whether you know of examples where this has been done in the Julia ecosystem.

Most numerical analysis concerns itself with approximations.

However, most pure mathematicians are in the business of proving theorems. We have beautiful fancy techniques to turn estimates into theorems, but these require proof-grade estimates, not numerical error guesstimates.

In other words, the desired output of the numerical algorithm is not so much an approximation (akin to a search problem), but rather a verifiable certificate.

A very promising example is Lyapunov Function Search · SumOfSquares

In that example, one constructs a Lyapunov function of the form ax^2 + by^2 + cz^2 with a, b, c > 1 for a specific rational 3d vector field.

Now @blegat – does the output of JuMP constitute a mathematical proof, in your view, that the specific output coefficients give a Lyapunov function?

The specific coefficients look suspiciously floating-point like, so is something like interval math used? Is it possible to also output a certificate, i.e. something like a 500 MB “execution trace” that can be fed into a simpler program, e.g. Metamath or Lean or Coq, for verification?

It is still common to have lengthy by-hand computations and estimates in papers; and people quickly get suspicious if one writes “and the right-hand side is nonnegative because solver XYZ said so”.

A much smaller frog to swallow is “solver XYZ constructed a certificate of nonnegativity; the sha1 of the certificate is xxxx, and it is available at yyyy, and can be verified with the simple and stable standard tool zzz”.

Another promising direction is interval arithmetic, e.g. IntervalArithmetic.jl.

So, e.g. @dpsanders – would you, personally, say that a claim by IntervalArithmetic.jl has the same “real mathematical proofiness” as e.g. a lean-verified proof or the old four color theorem?

To give a very simple but real-life example (dynamical systems / general relativity).

I have 5 real variables: x, y, a, b, c. These satisfy a, b, c > 0, and 1 = x^2 + y^2 + a^2 + b^2 + c^2 - 2ab + 2bc + 2ca. Use the shorthand r^2 = x^2 + (b-c)^2 and d^2 = 4bc.

Show that, for y < 0, and a, |b+c|, |b-c| < 1/2, and r >0 and d/r < 1 the following holds:
| sqrt(3) * x * (b+c) / r^2 + (a - 2(b-c))*(1 - (b-c)^2 / ((1-y)*r^2))| < 5.

This kind of computation is a royal pain! There is no proof worth writing down; it is obvious after staring at it and doing elementary estimates. All the interesting work is done in identifying the quantity that must be controlled, and the regime where it can be controlled, and using that to prove theorems.

And yet, it would be nice to also plug that kind of thing into a notebook that verifies that yes, this is correct.
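For what it’s worth, the mechanics of such a check with IntervalArithmetic.jl might look as follows. The sample box below is invented, covers only a tiny part of the parameter region, and ignores the sphere constraint, so this is a sketch of the idea rather than the proof:

```julia
using IntervalArithmetic

# One hypothetical sub-box of the parameter region (chosen for illustration):
x = interval(-0.5, -0.4); y = interval(-0.6, -0.5)
a = interval(0.1, 0.2);   b = interval(0.2, 0.3);   c = interval(0.1, 0.2)

r2 = x^2 + (b - c)^2              # the shorthand r^2
expr = sqrt(interval(3)) * x * (b + c) / r2 +
       (a - 2 * (b - c)) * (1 - (b - c)^2 / ((1 - y) * r2))

sup(abs(expr)) < 5                # true ⇒ the bound holds on this box
```

On a real problem one would cover the whole constraint region with such boxes, bisecting wherever the interval bound is inconclusive.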

Towards less simplistic things! A pretty cool construction used interval arithmetic plus the Conley index to verify chaoticity in the Lorenz equations (ODE): [math/9501230] Chaos in the Lorenz equations: a computer-assisted proof, and its companion

This should be a machine for cranking out theorems. But we don’t have the necessary support in DifferentialEquations.jl, do we @ChrisRackauckas ?

And even less simplistic. For example Computer-assisted proofs for semilinear elliptic boundary value problems | Japan Journal of Industrial and Applied Mathematics uses computer-assisted proofs to construct solutions to e.g. - \Delta u = \lambda e^u with Dirichlet boundary conditions. This nonlinearity should make all of us shudder – obviously naive purely analytical techniques for existence of solutions will fall flat. And yet – obviously actual solutions will only have finite Morse index, so they will be amenable to all our standard tools for approximation, especially discretization in a finite-dimensional function space. So it is feasible to take an approximate numerical solution, and then use some variant of Newton / Kantorovich / etc., plus a bunch of functional analysis, to prove a posteriori that a true continuous solution exists near the approximate numerical one. Not just feasible – done!

Do we have the machinery in julia to replicate such things?

Ultimately, I’m coming at this more from the “want to prove a theorem” side, not from the “what kind of theorem is feasible for computer assistance” side. I don’t really want to invent new computer-assisted proof techniques; I want to use computer assistance to apply my favorite Conley index construction to specific things. I don’t really want to look into details of interval arithmetic implementations. I am pretty clueless about algebraic geometry, but it would be awesome if I could have a nice oracle: “feed in a description of a semi-algebraic set, get a certificate of emptiness | a certificate of non-emptiness with a solution attached | nope, too difficult”. Many, many elementary estimates can be formulated as “is this semi-algebraic set empty?”.


This is less analysis and more algebra, but have you looked at OSCAR?

This reminds me of the following work on the Python side, by a friend of mine

So, I forgot to mention the most underused technology in this context: SAT solvers. SAT is the quintessential NP-complete problem, but that is no excuse for not making a credible attempt, and solvers like miniSAT are pretty good.

The strategies that humans use for boolean SAT and for estimation are pretty similar. If we have a system of equalities and inequalities and want to show that no solution exists (i.e. that the estimate encoded in it is true), then analogous techniques are:

  1. Branch / domain decomposition. Invent a new function f, and consider separately the cases f(x) <=0 and f(x)>=0 – in either case, the extra new inequality can hopefully permit additional simplifications. In SAT-solving this corresponds to taking a variable and checking the x_i == true and x_i == false cases separately. Of course this can make everything exponential, so it only makes sense if the extra assumptions permit significant simplifications!
  2. Auxiliary quantities / coordinate transformations. We can also introduce a new quantity y = f(x), i.e. add a variable and the equation 0 == y - f(x). Together with domain decomposition and standard equality-based CAS techniques, this can discover appropriate coordinates for different regimes. SAT solvers also do this all the time.
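As a toy Julia illustration of strategy 1 (everything here is invented for the example): bisect the domain until plain interval evaluation certifies positivity on every piece, giving up at a depth cutoff:

```julia
using IntervalArithmetic

# Try to prove f > 0 on X: if interval evaluation is inconclusive,
# split the box and recurse (branch / domain decomposition).
function provepositive(f, X; depth = 20)
    inf(f(X)) > 0 && return true     # the interval bound certifies this piece
    depth == 0 && return false       # give up: bound still inconclusive
    m = mid(X)
    return provepositive(f, interval(inf(X), m); depth = depth - 1) &&
           provepositive(f, interval(m, sup(X)); depth = depth - 1)
end

f(x) = x^2 - x + interval(3) / 10    # minimum value 1/20 > 0, at x = 1/2
provepositive(f, interval(0, 1))     # true
```

Of course this is exactly where the exponential blow-up lives: the recursion only terminates quickly if the extra splits actually sharpen the bounds.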

The SAT people are pretty cool, in so far as they have specified formats and verifiers for “certificates of unsatisfiability” (example link). Have the “prove that semi-algebraic set is empty” people developed a similar culture?

I would feel much less dirty submitting a proof in a paper that ends in “and miniSAT claims that this system of constraints has no solution. qed.” if I could attach a machine-verifiable proof of the miniSAT claim.

A real example where I used miniSAT (via python) was when working on an algorithm for collapsing simplicial flag complexes. The algorithm used a mix of greedy simplification, with (exponential-time) branching; and at some point I wanted a list of minimal graphs where branching occurs. So, encode “this graph can’t be simplified by my greedy procedure”, ask miniSAT for all solutions, sort by isomorphism type and obtain a classification of the smallest interesting examples. Classifying these graphs turned out to be somewhat of a dead end – but it is cool that SAT solvers made it doable as a lowish-effort speculative attempt.

The downside of SAT solvers is that many domains don’t have good tooling and guidance for encoding problems. The learning curve for encoding problems in CNF is not entirely trivial, but it is absolutely not necessary to become a SAT expert to occasionally use such a solver.
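For concreteness, CNF encodings in the standard DIMACS format are quite spartan; this tiny invented example encodes “exactly one of x1, x2 is true” as (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2):

```text
c x1 XOR x2, i.e. exactly one of the two variables is true
p cnf 2 2
1 2 0
-1 -2 0
```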

But for that, it is also not clear to me whether PEPit outputs are intended to be “proof-grade”. Does “PEPit claims convergence rate xyz” constitute a computer-assisted proof of the claim, in the same way that “SAT solver claimed unsatisfiability or even generated machine-verifiable certificate of unsatisfiability” constitutes a computer-assisted proof? Maybe you can ask your friend over a beer?

I’m harping on this so much because the vast majority of numerical analysis concerns itself with simulation / error guesstimates, and is pretty distant from the formal verification crowd. Philosophical difference: sometimes we care about what is true, sometimes we care about what we can formally prove. These are very different things! (Gödel: not all true things are provable.)

Ideally proofs are amenable to both human understanding and formal machine verification; most of the “formalization” program aims to make human-understood proofs machine-checkable, and I’m asking for low-effort machine-generated machine-checkable proofs where human understanding is infeasible or would be wasted.


Are you looking for something similar to Flocq that works with Julia code?

Flocq (Floats for Coq) is a formalization of floating-point arithmetic for the Coq proof assistant. It provides a comprehensive library of theorems on a multi-radix multi-precision arithmetic. It also supports efficient numerical computations inside Coq.

Not quite. As far as I understand, Flocq concerns itself with formal verification of floating point algorithms.

To give a simple example: I have an explicit rational function in 1d and I want to prove that it is positive on [0,1] (in fact, it happens to satisfy 1.0 <= f(x) <= 1e3 on this interval, i.e. it is well bounded away from zero).

I can then ask IntervalArithmetic.jl to compute the image; that gives me a bound like e.g. “0.5 <= f(x) <= 2e3 for all 0 <= x <= 1”.

If the theory and implementation of IntervalArithmetic.jl is correct, then this computation constitutes a computer-assisted formal proof of the desired fact. Human understanding would be wasted on this desired but somewhat low-effort fact.

On the other hand, if I just plot f with a fine grid and eyeball the definition, then this would be numerical evidence, not proof.

Tools like Flocq come in when we don’t trust the human-understandable proofs of theory and implementation of IntervalArithmetic.jl, and want additional machine checking. That’s cool, but not a can of worms I want to open.

In an ideal world, there would exist standard formats and verifiers such that IntervalArithmetic.jl could output a certificate for its claim. A certificate is somewhat different from the computation: IntervalArithmetic.jl may need some attempts to find a good decomposition of the unit interval into small sub-intervals such that the desired claim follows. This is a search problem: There are many different good ways, and heuristics are very much welcome at that stage; the certificate could be the final interval decomposition.
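To sketch what such a decomposition-certificate could look like (the function and all names here are invented): the search phase proposes sub-intervals of [0,1], and a deliberately simple verifier independently re-checks that they cover [0,1] and that f is positive on each piece:

```julia
using IntervalArithmetic

f(x) = (x^2 + 1) / (x + 2)         # stand-in rational function, positive on [0,1]

# "Search" phase: any heuristic subdivision is fine; here, uniform pieces.
certificate = [interval(k / 16, (k + 1) / 16) for k in 0:15]

# "Verify" phase: check that the (sorted) pieces cover [a,b]
# and that interval evaluation proves f > 0 on each piece.
function verify(cert, f, a, b)
    covers = inf(first(cert)) <= a && sup(last(cert)) >= b &&
             all(sup(cert[i]) >= inf(cert[i+1]) for i in 1:length(cert)-1)
    positive = all(inf(f(X)) > 0 for X in cert)
    return covers && positive
end

verify(certificate, f, 0, 1)       # true
```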

But even without that, I’d trust the authors if they dare to proudly claim “output of IntervalArithmetic.jl are computer assisted proofs”.

The specific coefficients look suspiciously floating-point like, so is something like interval math used? Is it possible to also output a certificate, i.e. something like a 500 MB “execution trace” that can be fed into a simpler program, e.g. Metamath or Lean or Coq, for verification?

You have a sum-of-squares decomposition (see SumOfSquares.sos_decomposition) that certifies nonnegativity, so that’s an actual proof. Indeed, it is floating point; you can use rounding methods to make it rigorous:
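A toy version of that rounding step, with the polynomial and decomposition invented for illustration: after rounding the solver’s floating-point output to nearby rationals, the SOS identity can be re-verified in exact arithmetic, and that exact identity is the proof of nonnegativity:

```julia
# p(x) = 1 + 2x + 2x^2, stored as ascending rational coefficients:
p = [1//1, 2//1, 2//1]

# Rounded SOS certificate: p = (1 + x)^2 + x^2, again as rational coefficients.
s1 = [1//1, 1//1]                      # 1 + x
s2 = [0//1, 1//1]                      # x

# Square a polynomial by convolving its coefficient vector with itself:
square(c) = [sum(c[i] * c[k-i+1] for i in max(1, k + 1 - length(c)):min(k, length(c)))
             for k in 1:2length(c) - 1]

# Exact rational check of the identity ⇒ p(x) ≥ 0 for all real x:
square(s1) + square(s2) == p           # true
```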


I’ve used IntervalArithmetic for a paper, to bound a 1D function (f below), which in turn ends up giving a bound related to some particular random graphs:

Not sure if it worked out; the paper was rejected from the Journal of Applied Probability, and we never ended up submitting it anywhere else. The referee never mentioned the interval arithmetic, though; rather, they made an inaccurate analogy to a balls-and-bins scenario and said “obviously” this doesn’t scale like root n, so the result must be wrong. (The paper also had a framing as something to do with epidemic modeling, rather than just a statement about particular random graphs, which I didn’t particularly like.)


That is a pretty nice example!

And yet, the interval arithmetic is not load-bearing?

Without having seen the unpublished paper, it seems that your main result is more like “there exist x_0 >0 and \mu>0 such that … holds” which can be easily seen without computer assistance, and oh, by the way, here are explicit bounds on them.

That is a different beast than a paper where interval arithmetic gives some bound 0.3468 < c < 0.35 in a Lemma, and there would be no theorem at all (maybe even the entire approach would be doomed) if one could not prove c > 0.33.

If the exact numbers are not load-bearing, then I would not have raised eyebrows about the interval arithmetic as a referee either!


They are the tiniest bit load-bearing, in that I didn’t bother to prove there is a unique max except via interval-arithmetic, although I’m sure it’s not too hard to do (I thought it was very cool that the interval-newton method could prove uniqueness so might as well use it :smile:). But I agree, here it’s mostly for knowing the value of the constants, so not the most compelling example. (Also the paper is on the arxiv if you do want to have a look; the appendix at the end has the interval arithmetic stuff).

edit: actually I’m not sure uniqueness really matters for the bound anyway, so maybe it’s not load-bearing at all.

Abstract interpretation (AI) for intervals might be relevant here. There’s a lot of AI papers proving various things about programs, for example Dynamic interval analysis by abstract interpretation. I know JET.jl has an abstract interpretation framework and I wonder if interval rules could be plugged into it.

Also GitHub - Keno/InterfaceSpecs.jl: Playground for formal specifications of interfaces in Julia was supposed to be able to prove things but it seems to not have really made it out of the gate yet.

I have worked on computer assisted proofs in analysis during my PhD, which I’m finishing in a few months, and I have made use of Julia for all of my work. So I can share my experience at least!

In my case the computer assisted part has been in terms of “rigorous numerics”, essentially interval arithmetic. I don’t personally have any experience with the slightly different approach of using SAT solvers for verification.

My work has been in spectral geometry, i.e. studying solutions to -\Delta u = \lambda u, and fluid mechanics, where I have studied traveling waves. You can find my papers, and links to all the code, on my website. I write a bit more about the software I have used below.

I know that the group surrounding Jean-Philippe also make use of Julia in some of their work. See e.g. for one example of a computer assisted proof with code given in Julia.


For interval arithmetic in Julia I’m aware of two main approaches: either using IntervalArithmetic.jl, which you already mentioned, or using a wrapper of the Arb library (which since a year back is part of the Flint library). For Arb there are currently two wrappers, Nemo.jl (which is in turn wrapped in OSCAR.jl, I believe) and Arblib.jl.

I have primarily made use of Arb in my work, mainly because I have needed a lot of special functions that Arb has good support for. Initially I used it through Nemo.jl, but since a few years back I have switched to Arblib.jl, which I’m a co-maintainer of.

If you are interested in seeing how Arblib.jl can be used in practice, you could take a look at the code for my papers. A better starting point might however be the package ArbExtras.jl, where I have implemented most of the generic algorithms that I make use of in my research. I have actually written a bit about this in the introduction to my PhD thesis, but that will only be published in May.

Correctness and formal proofs

Both Arb (and Arblib.jl) and IntervalArithmetic.jl are intended for use in computer assisted proofs, and you can find several published papers making use of them.

The Arb library is well established and, in my experience, extremely well tested. Over the years I have only encountered a handful of bugs, most of them benign. The Arblib.jl package is a rather thin wrapper around Arb, so there are not so many opportunities to introduce bugs in that part (which of course doesn’t mean there are none).

For the IntervalArithmetic.jl package I have encountered a few bugs that would affect correctness, see e.g. #622, #620, #616 and #612. Most of them have however been quickly fixed, and most of them are unlikely to have been a problem in practice.

Neither Arblib.jl nor IntervalArithmetic.jl is, however, capable of producing formal computer assisted proofs, at least in the sense in which I’m used to using the term “formal proof”. For me a formal proof is a proof implemented in a proof assistant (such as Coq or Lean), which has a very small kernel that checks all parts of the proof. As you mention, it might be possible to produce some sort of certificate from Julia that can then be verified in a proof assistant. I think it is a very interesting approach and I have experimented a little with it, but I don’t have much experience.


I would definitely say that interval arithmetic provides “proofs” – provided you accept that the code is correct.

A simple example is as follows:

julia> using IntervalArithmetic, IntervalArithmetic.Symbols

julia> X = 3..Inf
[3.0, ∞)

julia> f(y) = y^2 - 2;

julia> f(X)
[7.0, ∞)

julia> in_interval(0.0, f(X))
false

This says “for any x ∈ [3, ∞), we have f(x) ∈ [7, ∞)”.
In particular, 0 is not in that output set, and so with this simple calculation we have proved that f has no root in the interval [3, ∞)!

We can extend this as follows:

julia> using IntervalRootFinding, IntervalArithmetic

julia> f(y) = y^2 - 2
f (generic function with 1 method)

julia> roots(f, -Inf..Inf)
2-element Vector{Root{Interval{Float64}}}:
 Root([-1.41422, -1.41421], :unique)
 Root([1.41421, 1.41422], :unique)

This proves that f has exactly 2 roots on the real line, and gives small intervals in which those roots are guaranteed to lie (and to be unique within those intervals).

This is done by exclusion using the above interval arithmetic technique, coupled with an interval version of the Newton method.
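A minimal illustration of one such interval Newton step (the function is chosen for illustration): N(X) = m - f(m)/f'(X), and if N(X) lies strictly inside X, then X contains exactly one root of f:

```julia
using IntervalArithmetic

# One 1D interval Newton step for f(x) = x^2 - 2 around sqrt(2):
f(x)  = x^2 - 2
df(x) = 2x                       # derivative, supplied by hand here

X = interval(1, 2)               # candidate interval
m = interval(mid(X))             # midpoint, as a (thin) interval
N = m - f(m) / df(X)             # the interval Newton operator

# Strict inclusion N ⊂ X certifies a unique root of f in X:
inf(X) < inf(N) && sup(N) < sup(X)   # true
```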

As far as I’m aware there are basically no other methods that are able to prove statements like this. Similarly intervals provide the only (?) method to prove that you have found the global minimum of a function over a given input set. (There are methods like sum of squares for polynomials, but not for general functions.)

Of course, intervals are not a magic bullet: in particular, they behave badly in higher dimensions since you often need to divide the input set into an exponentially large number of boxes in order to use these techniques.

[There seems to be some bad interaction between the latest releases of the above two packages. Hopefully this will be fixed soon.]


A good example of a proof using interval arithmetic is that of the Kepler conjecture by Thomas Hales. That required proving a lot of inequalities of the type that the OP describes.

Interval methods are suitable only for “numerical” problems. For more symbolic-type problems, such as “if y > x, show that y - x > 0”, you need an SMT solver such as Z3. There are Julia interfaces in Satisfiability.jl and Z3.jl. There is indeed a lisp-like format (SMT-LIB) for formulating problems for SMT solvers.
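For reference, the symbolic example above would look roughly like this in that lisp-like SMT-LIB format: one asserts the negation of the claim, and an unsat answer from the solver means the implication holds:

```lisp
; Does y > x imply y - x > 0 over the reals?  Assert the negation:
(set-logic QF_LRA)
(declare-const x Real)
(declare-const y Real)
(assert (> y x))
(assert (not (> (- y x) 0)))
(check-sat)   ; unsat: no counterexample exists, so the implication holds
```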

For SAT solving I made an initial interface in SatisfiabilityInterface.jl, but it still needs some (= a lot of) work – any help with this would be great!