Linnea: automatic generation of efficient linear algebra programs

Came across this interesting article about Linnea and would like to share it.

The authors claim that the code Linnea generates almost always outperforms the recommended implementations in Matlab, Julia, Eigen and Armadillo, at least on the specific linear algebra problems tested.
The authors also say that Linnea generates Julia code, yet the tool itself now seems to be available as a Python package?

Does anyone know Linnea and could they shed some light?

There is an online form for Julia code generation:
https://linnea.cs.umu.se/

Input a linear algebra expression to generate Julia code that computes it. The following operations are allowed: Multiplication (*), addition (+), subtraction (-), transposition (trans(X)), and inversion (inv(X)).
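For instance, a hypothetical input built only from the operators listed above (the operand names are made up for illustration):

```
inv(trans(X)*X) * trans(X) * y
```

i.e. the familiar least-squares expression (XᵀX)⁻¹Xᵀy, for which the form then produces Julia code.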

How exactly should the diagrams in the article be read? The y-axis is labeled “Speedup of Linnea”.
Is Julia often the one that profits the most?

One must distinguish between the “Julia naive” and “Julia recommended” (recommended by whom?) implementations.

The bottom line might be that the authors decided to generate Julia code because it offers a good tradeoff between simplicity and performance.

Interesting, but since it outputs Julia code, it would be more interesting if it used the shiny new Julia Symbolics / ModelingToolkit ecosystem rather than Python for all the symbolic manipulation…

We could just implement this algorithm in Symbolics.jl. It’s worth opening an issue.

I’m happy to hear that you are interested in Linnea (I’m the main developer).

In general, I think that making Linnea’s optimizations available to Julia is a very interesting idea. There are some things to consider:

  • One of the major problems that I see for the integration into Julia (and other languages) is that Linnea requires the sizes of the matrices and vectors to be known at compile-time. I have some ideas for how this can be changed, but they require quite some research.
  • Linnea consists of a bit more than 15k lines of code, and it depends on a Python package called MatchPy, which consists of another 10k lines of code. I expect that reimplementing all of Linnea in Julia would take quite some time and effort. Of course, it would be possible to start with something simpler and only implement a subset of the optimizations used by Linnea.
  • There are some people looking into reimplementing Linnea in the LLVM/MLIR infrastructure. In case this effort is successful, Julia should be able to make use of Linnea’s optimizations. I will ping those people and ask them if they want to join this discussion.

To answer your questions:

Figure 9 shows the speedup of the code generated by Linnea over the other languages and libraries. If there is a solid red triangle with a y-value of 10, it means that the Julia code generated by Linnea outperforms the naive Julia implementation by a factor of 10 for that specific example problem.

The difference between the naive and recommended implementation is the treatment of linear systems. In the naive implementation, a linear system A^(-1)*B is implemented as inv(A)*B; in the recommended implementation, it is implemented as A\B. (At least the Matlab documentation points out that A\B should be used instead of inv(A)*B; that’s why we call it “recommended”.)

We are using both because it’s not entirely clear what a fair comparison is in this case: The naive implementation is closer to the mathematical description of the problem, and closer to the input of Linnea. On the other hand, it’s well known that inv(A)*B is bad both in terms of numerical stability and performance, so we also want to compare with A\B.
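The same contrast can be sketched outside Julia, e.g. with NumPy (a minimal illustration, not Linnea’s generated code): forming the explicit inverse corresponds to the naive variant, a factorization-based solve to the recommended one.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
b = rng.standard_normal(200)

x_naive = np.linalg.inv(A) @ b    # naive: explicitly form the inverse
x_solve = np.linalg.solve(A, b)   # recommended: LU-based linear solve

# Mathematically the same quantity, but the solve avoids computing
# and multiplying by an explicit inverse (better stability and speed).
assert np.allclose(x_naive, x_solve)
print(np.linalg.norm(A @ x_solve - b))  # residual of the solve-based result
```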

Thanks @henrik227 for your info!

Linnea consists of a bit more than 15k lines of code, and it depends on a Python package called MatchPy which consists of another 10k lines of code.

Julia happily interoperates with Python (see PyPlot.jl), which could reduce the integration work.

Hi! Very cool work and thanks for dropping by.

It seems like the Julia symbolics ecosystem, and particularly https://github.com/JuliaSymbolics/Metatheory.jl by @0x0f0f0f, provides a superset of MatchPy, along with more powerful tools for representing and transforming expressions such as e-graphs and compile-time metaprogramming, so at the very least we don’t need to worry about replicating MatchPy.
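As a toy illustration of the kind of pattern-based rewriting MatchPy (and, on the Julia side, Metatheory.jl) provides, here is a minimal pure-Python rewrite of inv(A)*B into solve(A, B) on an expression tree; the tuple representation and function name are made up for this sketch:

```python
# Expressions as nested tuples: ("*", x, y), ("inv", x), or a leaf string.
def rewrite_inv_times(expr):
    """Rewrite ('*', ('inv', A), B) into ('solve', A, B), bottom-up."""
    if not isinstance(expr, tuple):
        return expr                      # leaf: nothing to rewrite
    expr = tuple(rewrite_inv_times(e) for e in expr)
    if expr[0] == "*" and isinstance(expr[1], tuple) and expr[1][0] == "inv":
        return ("solve", expr[1][1], expr[2])
    return expr

print(rewrite_inv_times(("*", ("inv", "A"), "b")))  # ('solve', 'A', 'b')
```

Real systems generalize this with unification-based pattern matching and, in Metatheory.jl’s case, equality saturation over e-graphs rather than a single bottom-up pass.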

In fact, there’s also a lot of interest in putting linear algebra optimizations into Julia’s compilation pipeline. There are ongoing efforts to make this pipeline more modular, so that third party packages can easily plug in as first class citizens.

Using https://github.com/JuliaArrays/StaticArrays.jl one can track array sizes at compile time, then access either the SSA IR or the AST of the code, throw that into a data structure, manipulate it with the Julia symbolics ecosystem, and generate new optimized code.

Edit:

Another option is to use symbolic arrays, which track shapes and iteration using an einsum-type notation: Symbolic arrays · Symbolics.jl

Good thing all of this is provided by Symbolics.jl, even the sizing of the symbolic arrays.

If anyone of you is interested in learning more about Linnea, you can also take a look at my dissertation.

Yes, this is probably the easiest solution and a good starting point. @andreasnoack implemented a prototype of such an interface a couple of years ago when I was visiting MIT. Unfortunately, I don’t seem to be able to find the code anymore. However, both Linnea and Julia changed a lot since then, so I’m not sure if that code would still be useful anyway.
As mentioned by Paolo in his mail, it’s a question of how much time and resources one is willing to invest. A native Julia implementation should be much faster, but it’s also more work.

Good to know! As a side note: It should be fairly straightforward to implement a significant portion of what Linnea does with equality saturation. It’s not clear to me how one could directly make use of matrix factorizations, but perhaps that’s an acceptable tradeoff. With “directly making use of factorizations”, I mean what is described in section 5.7 of my dissertation (see link above).
It would be interesting to see how equality saturation compares to Linnea in terms of generation time and quality of the generated code.

Maybe I’m missing something, but I don’t see how a package like Symbolics.jl can solve this problem. The fact that Linnea only supports fixed operand sizes is not an issue of the internal representation of expressions; it’s a fundamental limitation of the algorithms. Given a linear algebra problem/expression, Linnea generates code that is optimal for specific operand sizes. The same code might be vastly suboptimal for any other sizes. Without any changes to the algorithms used by Linnea, it would be necessary to (re-)generate the code whenever the operand sizes change. To get an idea of the issue, you could take a look at the matrix chain problem: The optimal parenthesization depends on the matrix sizes.
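The matrix chain problem is easy to demonstrate: the optimal parenthesization, and its cost, depend entirely on the concrete sizes. Below is the standard textbook dynamic program in Python (an illustration of the size dependence, not Linnea’s actual algorithm):

```python
from functools import lru_cache

def matrix_chain_cost(dims):
    """Minimal scalar-multiplication count for a chain of matrices with
    sizes dims[0] x dims[1], dims[1] x dims[2], ..."""
    @lru_cache(maxsize=None)
    def best(i, j):  # cheapest way to multiply matrices i..j (inclusive)
        if i == j:
            return 0
        return min(best(i, k) + best(k + 1, j)
                   + dims[i] * dims[k + 1] * dims[j + 1]
                   for k in range(i, j))
    return best(0, len(dims) - 2)

# The same chain A*B*C with permuted sizes has very different optimal costs:
print(matrix_chain_cost((10, 100, 5, 50)))  # 7500: (A*B)*C is best here
print(matrix_chain_cost((10, 5, 100, 50)))  # 27500: A*(B*C) is best here
```

Code generated for one set of sizes hard-codes one parenthesization, which is exactly why regeneration is needed when the sizes change.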
In section 11.1.1 of my dissertation (see link above), I discuss a bit what would be necessary to support variable operand sizes in a meaningful way.

Yes… that’s why I mentioned Symbolics.jl: its operand sizes are fixed, and it has an internal size-propagation algorithm.

https://symbolics.juliasymbolics.org/dev/manual/arrays/#Symbolic-arrays-1

Ah, I see. Sorry for the misunderstanding.

Another potential approach is to use laziness to gain some similar advantages to Linnea. I had a proof-of-concept (posted previously; apologies for self-promotion) that lazily recognizes inv(A'A)*A'b as a pseudo-inverse, similar to Fig 3 of the Linnea article. Except instead of a graph, it stores info such as positive definiteness and matrix solve in types, and then tries to do the right thing when the operations resolve into an expected pattern. Re-posting from before,

julia> using LazyPseudoinverse
julia> A = [1 2 3; 4 5 6; 7. 8 10; 11 13 19];
julia> b = [5, 8, 10, 11.];
julia> inv(A'A)*A'b # looks like a normal matrix operation
3-element Vector{Float64}:
 -4.679738562091531
  8.222222222222245
 -2.3333333333333317

julia> inv(A'A)*A'b == A\b # it's really a QR solve!
true

If the pattern ended up as inv(A'A)*C, it would still recognize positive definiteness and use a Cholesky factorization. In principle, any use of inv could be evaluated lazily and automatically turn into \ when right-multiplied by something. The lazy approach is probably very limited in what it can do compared to Metatheory.jl or Linnea, but for simple things it is quite straightforward and doesn’t need to know the sizes in advance.
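A rough Python/NumPy analogue of the lazy trick (the class name is hypothetical; the actual Julia package encodes more information, such as positive definiteness, in its types):

```python
import numpy as np

class LazyInv:
    """Represents inv(A) without computing it; multiplication resolves to a solve."""
    def __init__(self, A):
        self.A = A
    def __matmul__(self, B):   # LazyInv(A) @ B  ->  solve(A, B)
        return np.linalg.solve(self.A, B)

def inv(A):
    return LazyInv(A)          # shadows the eager inverse with a lazy one

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = inv(A) @ b                 # never forms the explicit inverse
assert np.allclose(A @ x, b)
```

The key design choice is the same as in the Julia version: defer the inverse until an adjacent operation reveals a pattern that can be dispatched to a better algorithm.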

Interesting, I was not aware that something like this is possible in Julia.