Interactive and mathematicaly sound code transformations

For quite some time I have been playing around with the idea what the next generation scientific computing language might look like and I would like to share my prototype. Mainly because Julia users are the kind of people I’m targeting and you might find it interesting.

In no way I’m here to compete with Julia, this is purely experimental thing but I would like to get some feedback, opinions and some ideas on what should I focus.

The library is written in Lean 4 which is a language that allows you to give precise mathematical reasoning to your code. You are able to treat any code purely symbolically and interactively transform it. The high level vision is to provide a computer algebra system on top of your code.

Best demonstrated by an example, simulation of harmonic oscillator:

We first define system’s energy:

def H (m k : ℝ) (x p : V) := (1/(2*m)) * ∥p∥² + k/2 * ∥x∥²

Them we define what program we want to create:

def solver (m k : ℝ) (steps : Nat)
  : Impl (ode_solve (HamiltonianSystem (H m k))) := ...

This states that we want to have an ode solver of a Hamiltonian system. One of the very interesting things is that code can be purely symbolic i.e. can’t be executed but it carries meaning. Both ode_solve and Hamiltonian system are non-computable function and you turn them to executable program via code transformation. The code transformation:

def solver (m k : ℝ) (steps : Nat)
  : Impl (ode_solve (HamiltonianSystem (H m k))) :=
  simp [HamiltonianSystem, H]

  rw [ode_solve_fixed_dt runge_kutta4_step]
  lift_limit steps "Number of ODE solver steps."; admit; simp

Going through the lines in the editor looks like this:

This roughly runs automatic differentiation to obtain Hamilton’s equations and approximates ode_solve with RK4.

A bit more complicated system n-body simulation:

def Coloumb (ε strength mass : ℝ) (x : ℝ^3) : ℝ := - strength * mass * ∥x∥^{-1,ε}
argument x [Fact (ε≠0)]
  isSmooth, diff, hasAdjDiff, adjDiff

def H (n : ℕ) (ε : ℝ) (G : ℝ) (m : Idx n → ℝ) (x p : (ℝ^3)^n) : ℝ :=
  (∑ i, (1/(2*m i)) * ∥p[i]∥²)
  ∑ i j,   Coloumb ε G (m i * m j) (x[i] - x[j])

argument p [Fact (n≠0)] [Fact (ε≠0)]
  isSmooth, diff, hasAdjDiff, adjDiff

argument x [Fact (n≠0)] [Fact (ε≠0)]
  isSmooth, diff, hasAdjDiff, 
  adjDiff by
    simp[H]; unfold hold; simp
    simp [sum_into_lambda]
    simp [← sum_of_add]

Currently, you need to annotate function’s arguments manually for the AD system to generate gradients.
Annotations isSmooth and hasAdjDiff automatically generate proofs that the function is indeed differentiable and has gradient(adjDiff = adjoint of differential). I separate differential and gradient as there are some mathematical issues when working with infinite dimensional vector spaces. Also, I’m using regularized Coulomb potential, so I need to assume that the regularization parameter is non-zero [Fact (ε≠0)], without it I would get a compiler error that Lean could not generate the proof of smoothness.
Annotations diff and adjDiff generate the code for differential and gradient. If you do not like automatically generated gradient you can jump in and apply some transformations manually, see adjDiff by .... The command simp[H]; unfold hold; simp is the standard AD command but the resulting code contains multiple sums that can be fused together. This is exactly done by the commands simp[sum_into_lambda]; simp [← sum_of_add]. And for example sum_into_lambda is a user defined theorem :

theorem sum_into_lambda {X Y ι} [Enumtype ι] [Vec Y]
  (f : ι → X → Y)
  : (∑ i, λ x => f i x) = (λ x => ∑ i, f i x)
  := sorry

Where sorry is the omitted proof of this theorem. You can skip proofs if you do not have the time to do it or you are confident enough that the theorem/rewrite rule is correct.

Fun experiment if you also include Lennard Jones potential:

The main point of using Lean is that you can easily state new transformation rules. Therefore bringing the code transformation right to the users fingertips. Furthermore, you can play around with these code transformations interactively, it is really easy to add/remove new rules and this tremendously simplifies designing and debugging code transformations.

One of the great strengths of using Lean is that even though I currently skip 99% of all proofs I still have tremendous confidence in my code. This is because Lean allows me to formally state what the code is supposed to be doing and this helps a lot at organizing and designing the code.

My current focus is to get symbolic/automatic differentiation working. Why another automatic differentiation system? I’m not there yet, but I believe I will be able to do symbolic/automatic differentiation with higher order functions. In particular, reverse mode AD with higher order functions and symbolic variational calculus and I’m not aware of any system capable of doing that, like deriving Euler-Lagrange equations by differentiating action integral.

The long term vision is to build a system where you state what kind of equation you want to solve and then get into a dialog with the computer on how to actually solve it. As the system understands mathematics I can imagine it might be able to analyze your problem and offer equivalent reformulations, appropriate discretizations etc.

This project is still in very early stage, but I would love to hear your thoughts. Maybe someone will be interested and courageous enough to try it out.


And the project github page: GitHub - lecopivo/SciLean: Framework for scientific computing written in Lean