[ANN] Symbolics.jl: A Modern Computer Algebra System for a Modern Language

Things did go wrong. Here’s a bit of the history so far. We started in DifferentialEquations.jl with a small macro, ParameterizedFunctions.jl. What if we could symbolically improve people’s model code?

This was first based on SymPy, which was too slow for the prototype to work out well. At the same time, the SymEngine release announcement popped up on the Julia Google Groups (before Discourse, or 2BD). SymEngine was just what we needed for a small bit of symbolics, and in October 2016 that is where we were:

We were happy with that for a while, but along came @Gaussia with improvement after improvement to DiffEqBio (now known as Catalyst.jl). It was closely (but not exactly) copying similar code for symbolic improvements like Jacobians over to the @reaction_networks DSL:

and similarly Pumas.jl was released shortly after, built on ParameterizedFunctions.jl.

However, we started noticing that everyone in the community was similarly duplicating work inside of DSLs: there was no symbolic target for DifferentialEquations.jl, so every package made its own. What if there was one fully-featured symbolic interface for “if you symbolically declared an ODE like this, then we all share the same transformations”? That became ModelingToolkit.jl. But as we expanded beyond “build Jacobian and simplify”, SymEngine wasn’t enough: it was missing too many features. And SymPy was too slow.

So we gave a try to Reduce:

That was okay, but still not hitting the SymEngine speeds. Soon after, that was replaced by what you’d probably call version 1 of the pure Julia part of the CAS, where not just the added higher level features but the full representation itself was in Julia. @HarrisonGrodin was really instrumental in that.

It was at this time (2018!) that we first started realizing that (a) this is the right feature set for what we need but (b) we should probably be refactoring it around a core rewrite system. See some of:

and

This is where paper after paper on rule rewriting systems and CASs, the pros and cons of different approaches, etc. all started flying through the Slack. While Terms.jl never truly came to fruition as we had hoped, we still had a meager CAS inside of ModelingToolkit.jl which did what we needed so :+1: we were happy for a bit. ParameterizedFunctions.jl, DiffEqBio (now Catalyst.jl), Pumas, etc. were now parsers of DSLs generating MTK forms, coalescing onto one code base with one definition of Jacobian. Great!

Until we needed to scale a bit more, needed equation solvers, etc. We got @shashi as a PhD student in the Julia Lab and funding from ARPA-E to develop this into a full Modelica-like system. For a few details, see for example the Dymola compilation process:

@shashi, @YingboMa, and @Mason pored over the papers (and @shashi took Sussman’s courses on the topic, becoming a fan of the Structure and Interpretation of Classical Mechanics book :wink:), leading to a fully-featured rewrite system in SymbolicUtils.jl.

@shashi’s release of SymbolicUtils.jl references tools like sicmutils as a big influence in its approaches.

At this point, ModelingToolkit was starting to have part of itself be a CAS, while the other big chunk was an equation-based modeling system. SymbolicUtils.jl had enough CAS features that people wanted it to be a CAS (just look at how many Discourse articles there were on it), but of course all of the high level features like differentiation and build_function (the analogue to lambdify) were part of ModelingToolkit. So MTK’s docs started accommodating two audiences: those who just wanted a CAS, and those who wanted equation-based modeling. But as it grew into a full Modelica competitor (which you will soon see has many new additions to go beyond Modelica :wink:, papers and tutorials coming), there was “no room” in the documentation.

So we set a vision and a plan for how to get there, already had a developer team and funding, started seeking out additional funding sources, and finally split the CAS off, placing it into an organization with SymbolicUtils.jl and forming JuliaSymbolics.

So we definitely had some bumps along the road and false starts, mostly because 5 years ago we didn’t know we were on the road to building a CAS. But over this time, after connecting with many libraries (some of which never got public code working well), digging through the literature, and trying multiple rewrite systems, we are at something that works pretty well, and that’s Symbolics.jl. It is fast, internally parallelized, and hits a lot of core features that people seem to want. It needs some more documentation and tutorials, mostly because it was confined to just being the first tutorial of ModelingToolkit for the last few years, and now it’s free to expand its tutorials into every topic a CAS covers.

And the improvements haven’t stopped there. In order to do staged programming and on-the-fly JIT properly, RuntimeGeneratedFunctions.jl was created as a solution:

And recently I’ve been talking with @0x0f0f0f a lot about MetaTheory.jl, which is an alternative rewriting system based on recent innovations called egraphs:

When I say recent, I mean POPL Distinguished Paper 2021. So how does this relate to more traditional rewrite approaches? You won’t find any papers on what happens if old rewrite system X is replaced by egg! In my mind there needs to be an organization building out a full CAS to research that question in full realistic scenarios. It needs to be in one of the high level languages where the researchers are working, and be fully extendable, with its theory and rule system editable on the fly from the host language. And that’s where we are at with Symbolics.jl.

Given this journey it would be foolish of me to think we have found “our final form”, especially as we are evaluating other candidate rewrite systems to have under the hood! And I believe that a combination strategy will likely be the solution, using egraphs in some places and rewriters in others. But yes, we learned from mistakes over time and now it’s pretty battle-hardened. At each step of the road we required it to solve all of the problems of the past, and do more. I think we are well past high level “read a paper” and by now deep into the weeds.

[Hopefully that history is just fun to read :grinning_face_with_smiling_eyes:]
