[ANN] Metatheory.jl

Hi everybody! I’m happy to announce that the first version of Metatheory.jl is out!
You can find the code here GitHub - 0x0f0f0f/Metatheory.jl
And if you want, you can read the introductory preprint here [2102.07888] Metatheory.jl: Fast and Elegant Algebraic Computation in Julia with Extensible Equality Saturation
But what is Metatheory.jl?

“Metatheory.jl is a general purpose metaprogramming and algebraic computation library for the Julia programming language, designed to take advantage of the powerful reflection capabilities to bridge the gap between symbolic mathematics, abstract interpretation, equational reasoning, optimization, composable compiler transforms, and advanced homoiconic pattern matching features.”

The package will be available in the Julia registry in 3 days, the mandatory waiting period. If you’d like to try Metatheory.jl, you can install it from the Github repository, as described in the README

There are a lot of possible use cases for Metatheory.jl : symbolic mathematics, compiler optimization, composable compiler transforms, theorem proving, program analysis, category theoretic algebraic computing, programming language interpreters.
I would be really happy to hear any kind of feedback and proposals, and to experiment with you in possible applications and use cases of Metatheory.jl . Please note that this package is still in a very early stage and I would not recommend you to start using it in production packages.
I would recommend, if you are planning to play and experiment with this weird new package, to read a bit about term rewriting, homoiconicity and metaprogramming, and particularly e-graphs. You may read more about our EGraph implementation in this paper by Max Willsey, distinguished at POPL21 and in these blogposts 1 2 by @Philip Zucker .

Feel free to open topics in the Metatheory.jl Zulip Stream
If you’d like to help in contributing to the project, please check out the CONTRIBUTING.md guide. There are a few things that need some works: we can optimize and make Metatheory.jl much faster, and we can investigate and experiment to see how to integrate the e-graph rewriting backend with Julia’s subtyping system and type lattice. One of the most requested features, was specifying and matching rules on the type of variables (symbols), and not only on the type of literal values. Another requested feature, really interesting to investigate, is attaching logical statements to expressions in the e-graph. Let’s experiment together find an elegant and efficient way of having these two killer features!

35 Likes

This package looks amazing.

Julia is definitely a great fit for this kind of meta-programming framework. Also, your example that defines the semantic of a simple imperative language is remarkably simple and elegant.

I am looking forward to seeing more advanced applications, especially in the domain of performing advanced code optimizations.

Are you planning to give a talk about this at JuliaCon 2021?

7 Likes

I would really like to go for JuliaCon.

I am looking forward to seeing more advanced applications, especially in the domain of performing advanced code optimizations.

Have you seen the applications of the egg rust library? There’s optimization of floating point code for precision (herbie), faster neural networks (tensat), and more! It’s all something that is done moving back and forth between Racket, Rust, Python etc… My implementation is meant to do the same tasks without leaving Julia. Remember the two language problem?

5 Likes

Looks very interesting ! I would love to use this meta-programming approach and your library for automatic generation of probabilistic graphs.
But I guess it would require the equality saturation technique be extended to probabilistic graphs ? Or maybe come up with a fussy equality between joint distributions (like a threshold on the Kullback-Leibler Divergence) ?

If this is possible, it might allow efficient rewriting of graphs with stochastic variables.
For example, the two following graphs for generating mulivariate gaussian noise are equivalent in terms of distributions:

### Graph 1
e ~ MvNormal(μ,CC')
### Graph 2
n ~ MvNormal(0,I)
e = μ + C*n

Thank you for the introductory references ! I will definitely check the paper of Max Willsey and the blog posts you recommended.

Yes, I am aware of these applications and I was wondering if a package such as Metatheory.jl, with its nice integration to the Julia language, could make them more accessible and practical (and therefore more widely used).

It’s all about the integration I think. Any Herbie-like features I’d plan to use from within ModelingToolkit.jl, in which case plenty of users would be using it day one. Similarly SciML is interested in exploring a bunch of tensat-like things.

2 Likes

To optimise dagger graphs ? Cc @jpsamaroo

Sure, anything is possible :slight_smile: But doing so would incur significant compilation cost, is probably not a trivial implementation task, and would not necessarily have significant performance gains. It’s a similar argument to whether to include Metatheory.jl directly in Julia’s compiler; can you show consistent runtime performance improvements, while not imposing a significant compile-time cost?

Of course, you can try this as an external package that users can try out on their problems. I think that would be a very reasonable approach!

1 Like

FYI, I’ve just added extended API docs with DocStringExtensions.

https://0x0f0f0f.github.io/Metatheory.jl/dev/api/

2 Likes

After watching Allen School Colloquium: PLSE Research Group - YouTube, I realized that I was wrongly assuming that Metatheory was too niche for anything that I currently contribute to. So I’d like to amend my previous comment by pointing out the following potential area of exploration:

Dagger has an array interface, exposed to users as the Dagger.DArray. Operations on DArrays are lazy, similar to how transpose or adjoint are lazy in Julia. For example, doing A * B where A and B are both 2-dimensional DArrays, would produce a MatMul operator which lazily represents the matrix multiplication between A and B. When computed with Dagger, the operation is then actually executed to produce a result. Many such operations may be combined together, and there are various other operations defined for DArrays, like Transpose, Scale, Concat, Sort, etc. All of these operations are lazy, and can be easily introspected. The point I’m trying to make is, Dagger has a native graph representation that is potentially amenable to optimizations.

What I would find valuable is if someone could implement a simple and efficient way to optimize these sorts of graphs of array operations to reduce both complexity and runtime cost. If someone could present a small example of how to use Metatheory to optimize a few example graphs and show benchmarks of the performance improvements, I’d be happy to add it to Dagger’s docs as a recommendation for users to try out on their own Dagger graphs. Also, if the performance improvements were significant without the “compile-time” costs being too large, I’d be willing to consider adding a direct or indirect dependency on Metatheory in Dagger, so that it can used easily and possibly by default.

Of course, the reason I’m proposing this here instead of doing it myself is that I don’t really have the time or understanding of term rewriting at this moment to be able to implement this myself. However, if anyone has been meaning to learn more about Dagger, and do something fun along the way, I see this as a fun afternoon or weekend project that could end up helping people automatically optimize their own programs and calculations.

5 Likes

This is really interesting. I have a question about e-graphs. Is it like this?: the egraph represents in one graph all the possible rewrites of an expression (given a particular rule set).

Still, one has to choose one of the possibilities, that is apply a particular sequence of rules. This is in general an exponentially hard problem. How does the graph help tame the combinatorial explosion of paths?

1 Like

The egraph represents congruence of equivalence relations on terms. It is one of the most efficient (and compact) data structures for this. You can apply rules nondeterministically (at the same time), have advanced heuristics for speeding things up (EGraphs.Schedulers) and you can detect when there are no rewrites to apply anymore (satiration) and stop, or you can stop when you have proved something about expressions. This solves the problem of in what order rules are applied. You can have analyses from a semilattice domain (abstract interpretation) and have conditional/dynamic rules generated dynamically from the analyses data. (and allows for much more)

4 Likes