[ANN] Symbolics.jl: A Modern Computer Algebra System for a Modern Language

That’s great to hear! As a user, I very much like that I can just fire up OpenModelica/OMEdit, connect a couple of tanks, pipes and pumps, a PID controller, and a state machine together (in a GUI, too) to implement a multi-domain model for an industrial process, where I “just” have to deal with defining the correct (physical) parameters for the involved components.

Sure. I was more thinking that an excited reader (that doesn’t lurk here :sweat_smile:) has to start searching online, or browse referenced papers, instead of clicking a footnote in the paper pointing to mtk.sciml.ai/
Maybe that’s a pet peeve of mine, though – papers on some implementation/package/library, sometimes old and/or obscure, without a link to the source.

Re the code, I was thinking less of the toy/demo problem, and more of the more “realistic” big HVAC model, which surely has more detailed models, probably thermodynamics, maybe media models, and is benchmarked against an Modelica implementation. I thought that those models would be made available, e.g. for the OpenModelica folks to take a stab at and benchmark from “the other side”. Considering the danger of less representative benchmarks in general (especially across tool/lang boundaries), that would surely strengthens the case for MTK, no?

Afaict, OpenModelica’s Modelica compiler is already mainly (fully?) implemented in this (bootstrapped, too). Also, I’m not sure which community is lost here, the Modelica (the language) community is not that small, either, and heavily backed by industry as far as I can tell.
Surely, MetaModelica does not “gain” the Julia community, though, but that is not the aim of MetaModelica. This was more meant as a reference to a similar embedding effort in Modelica. :man_shrugging:

Also, you’re surely correct concerning the wider scope of ML tooling, compatibility with many Julia packages, etc. that MTK enables!

Anyway, I’m glad to see all this! In the end, as a user, I can only win if there is more choice! :smiley:

3 Likes

@JeffreySarnoff seems to have given it a shot, but it resulted in an error in the Threading testset:

   Evaluated: isequal(a*((1 + d)^-1) + (0.6666666666666666 + a)*(c^-1) + (b + (b^2))*(d^-1) + 2.0d*((1 + c)^-1)*(((c^-1) + (d^-1))^-1), a*((1 + d)^-1) + (0.6666666666666666 + a)*(c^-1) + b*(1 + b)*(d^-1) + 2.0d*((1 + c)^-1)*(((c^-1) + (d^-1))^-1))

Some adjustment is needed here, but it’s not clear what.

1 Like

I think a simpler explanation of AD as alternate interpretation for the same abstract program text, or as compiling the text is given in this paper of mine:

… assuming you are willing to see AD in utmost simplicity in the context of Lisp.
I think the same framework (using a package to overload the usual operations) could be used for sparsity detection, but I have not looked at this in any detail.

2 Likes

You have reconstructed my foray correctly. The only alteration I had made to the PR branch was to change the default for that parameter from false to true … and with that change, the function matched its docstring.

Upon noticing a test had failed, and determining that the failure more likely than not had occurred as a consequence of my edit – I let go of the PR. Wholly unfamiliar with Symbolics.jl (and totally unaware of how it tests), changing the default state of a boolean flag to match the docs seemed without danger. It could be that the inline doc was in error, not the setting (then there would be some other error that fails when you tried to play). If that is not the case, then some other edit has been overlooked or underbaked – either way, to resolve this requires the (possibly brief) attention of someone with project local expertise.

:surfing_man:t3:

4 Likes

Not only could it be used for that, we explain in the blog post how to do it, and it’s a feature of Symbolics.jl:

https://symbolics.juliasymbolics.org/dev/manual/sparsity_detection/

And with the direct tracing it can produce sparsity patterns from Julia code.

I’ll get Shashi to take a look. He’s taking a day off from the paper completion.

7 Likes

It is possible that the documentation erroneously says that polynorm=true is the default, while in fact polynorm=false is the default. If some of the tools depending on Symbolics has used the default argument value ("false"), they may possibly fail if the default value of the simplify function is changed

2 Likes

I enjoyed reading the paper! Question: is there some provision for taking out the ModelingToolkit.jl documentation as a pdf file to read in a coffee shop, instead of viewing it on screen?

2 Likes

Not yet. Open an issue in Documenter.

That’s a @shashi question. Open an issue on the repo, though after cramming for the paper we’ll need a day or two.

4 Likes

Speaking about Groebner bases implementation, there is a new joint effort: msolve (Multivariate Polynomial Systems - msolve library). One of the authors is the author of this GroebnerBasis package in Julia: GitHub - ederc/GroebnerBasis.jl: Julia wrapper for gb
To the best of my knowledge, the authors consider having a Julia interface as one of the goal for the near future.

5 Likes

Doesn’t Documenter.jl have a pdf backend you could use to compile the docs to pdf locally?

2 Likes

I never found out how to make it work, but if you know how, please please help!

3 Likes

BTW, there was some earlier speculation in this thread about egraph techniques. This is now realized with MetaTheory.jl:

It can now use the Symbolics.jl expression formulation directly in egraphs, which means the asymptotically good egg rule expansions are available. That right there is something to be excited about!

10 Likes

Wait so let me make sure I understand this correctly.

  1. Symbolics.jl is a CAS.
  2. Metatheory.jl can expand terms given by this CAS under some rewrite rules into all possible equivalent terms.
  3. We can now extract the smallest/fastest/most efficient term under some metric of all equivalent terms. We thus receive the most efficient way to represent the symbolic calculation, since that’s what we started out with. (Possibly for generic julia code, since Metatheory.jl seems to be able to work with Expr)

Did I get this right?

4 Likes

Yes basically. Symbolics.jl uses by default SymbolicUtils.jl which is just a fast rewrite system. E-graphs and egg, which is what MetaTheory.jl uses, are asymptotically better. But timings from the Zulip:

shows that it has more overhead on small problems. With this connection, we can seamlessly switch between the two approaches, giving asymptotically good rule applications when expressions get big. So big simplify expressions, big integrals, etc. That should be a fun topic to explore!

12 Likes

Hi all,

Speaking as current maintainer of sympy I want to say firstly that I wish you the best of luck!

While I really like sympy and use it all the time it can of course be better. I want sympy to be better and I hope that Symbolics.jl can be better as well.

I am only recently learning Julia but I will certainly try out Symbolics.jl. Perhaps I will contribute in some way in future but as I say I’m a beginner in Julia right now.

I have many more things I could say but having skimmed this extremely long thread I want to make one point clear. There are a number of comments above about sympy being slow (which it is for some things) and the implication seems to be that sympy is slow because it is in Python rather than being implemented in e.g. Julia. That point is mostly incorrect:

  1. Usually when sympy is slow it is because the implementation in sympy is unoptimised or the best algorithms are not being used.

  2. Many of the difficulties in improving performance in sympy stem from early design decisions about very basic behaviour like automatic evaluation or the use of assumptions.

  3. Sympy could be made much faster while still working in Python and there are often (significant) performance improvements but it’s harder to implement these once you have many users and need to consider backwards compatibility etc.

Julia and Symbolics.jl do not need to worry about point 3) so much (yet) which helps a lot!

Addressing point 1) it is a huge amount of work and I do not see any evidence yet that it would be any easier in Julia than Python. A language speed difference of say 100x doesn’t help much when your algorithm has O(2^{x}!) complexity! You have to remember that as a CAS grows most contributors will not be able to understand the performance implications of even “elementary” operations. Few people will ever understand enough of the internals of any CAS to be able to actually say what the algorithmic cost of an operation like expand is. When few people understand the performance of low-level operations it’s hard for anyone to build efficient high-level operations.

Most important though is point 2). Seemingly reasonable design decisions made at an early stage (perhaps already made) can have a huge impact down the line and can be much harder to change by the time the problems are understood. If Symbolics.jl is hoped to be a long-lived and successful CAS to rival or exceed sympy and others then it is hard to overstate the importance of learning lessons from the past. (I haven’t actually looked at the Symbolics.jl package yet but I don’t see significant discussion of the difficulties faced by other CAS in the thread above.) Most importantly learning those lessons does not mean copying the behaviour of existing CAS!

Do not dismiss sympy’s difficulties as being related to Python or you risk repeating the same mistakes as CAS of the past. There are many good things about sympy and other CAS but overcoming their limitations requires a deep understanding to solve the combined problem of:

  1. Algorithmic symbolic manipulation
  2. Designing intelligible software
  3. Managing a huge long-term development project

Point 3. is the one that can be most limiting in the long term.

Oscar

85 Likes

Thank you @oscarbenjamin !

1 Like

Very interesting. Can you guide me to literature about why SCAS need more than real and complex numbers? I also very interested about your experience about bridging the gaps between SCASs and MCASs.

First, many methods that deal with algebraic equations, symoblic integration, etc. need to perform computations over finite fields, algebraic numbers, p-adic numbers, etc.

Second, cryptography, which is I’d say somewhere on the border of science and engineering, uses most of the above.

Third, systems of inequalities and equations (and optimisation problems with these constraints) scientists and engineers often encouner need to be solved over integers, not over reals or complex numbers. Integer lineal programming is a huge applied branch, important in industry (think about e.g. scheduling).

Fourth, precision problems quite often force one to switch to precision higher than that provided by one’s CPU/GPU.

7 Likes

strictly speaking Sage does not require a VM to be run on Windows - you can get it running in Cygwin, which is a POSIX emulator (as some components of Sage, e.g. GAP, use fork(), mmap(), etc). And another option is using WSL2, which is not quite a VM.

I have been directed to this thread - very interesting indeed.

So, first, I should introduce myself: I worked on Maple for 11 years. I was head of the Math Group for much of that time, and then loftier titles for a few years, before going back to academia because I wanted to do really explore the foundations and semantics of CAS “from the ground up”. I’ve been mostly doing meta-programming and digging deeper into theorem proving (I drifted from Maple → OCaml → {Haskell, metaocaml} → Agda over the years). But I also went back to using Maple for a while for the purposes of the probabilistic programming language Hakaru that I worked on for ~5 years.

Lately I’ve been working on leveraging ideas in Universal Algebra as a way to generate theory libraries (and code and …) that shouldn’t be written by humans.

Do I have opinions about CAS? Indeed, tons. I’ve written some tweet storms about some of them.

But first, I should read more about the design of Symbolics.jl. The things that are ‘unchangeable’ about it. It would be entirely pointless for me to come in and tell you that all of that is rubish (even if I it turned out that that was my opinion). If I want to be helpful (and that why I’m here), I should at least give you advice on past mistakes that you can avoid, rather than pontificating on random stuff.

So: what should I read to inform myself?

PS: Risch is a horrible way to do integration. Indefinite integration is not nearly as useful as it’s made out to be. Applications tend to want definite integration, even if it’s from some constant to some arbitrary point ‘x’. That is indeed subtly different. The interesting problem in integration is definite integration, which is a problem of analysis, not one of algebra. The way the integration problem is turned into algebra fundamentally erases many of the things that make integration interesting (singularities). FToC tends to be an especially hard way to compute definite integrals, as the limit problems generated thusly tend to be harder than the original question!

59 Likes