Grassmann.jl A\b 3x faster than Julia's StaticArrays.jl

Yes, I’m well aware of all that. Having @pure does actually improve my code in a lot of places, it’s not something I can automatically remove from everywhere. I agree it is worth looking into how much the constant propagation has improved and to check if some of the pure annotations can be removed, but it’s not currently a high priority.

That may be a relevant factor but I think a much bigger and harder to avoid problem is that the method table itself is a global mutable state. Hence any function which relies on the state of the method table is itself impure.

That means no generic, multiple dispatch functions should be called inside a @pure function body, which is almost impossible in Julia.

This mutability in the method table may not effect you if you’re not hot-modifying it, however, it seems like it’s liable to cause weird, hard to reproduce bugs if other people were to try and build tools on top of Grassmann.jl, adding their own methods and overloading your functions (or if someone pirated the methods you use, or if you pirate someone else’s methods).

This seems especially worrisome since you do have a tenancy to advertise Grassmann.jl as something people should adapt their project to make use of, so if someone did act on that it could result in an unfortunate situation.

7 Likes

It is clearly stated in my repository that there is no warranty. Those who are serious about using Grassmann.jl should consider consulting with me.

Hum, too bad. I was really excited by your post, because I have to solve systems like A\b in two occasions: interpolating orbit and inside the Kalman filter (which applies to the attitude and orbit estimation inside the satellite). This performance gain would be very handy. However, given all the comments, maybe I will need to wait until those concerns are solved.

Anyway, great work! It is very interesting.

1 Like

@Ronis_BR, I don’t think you will encounter a major issue with the A\b method, would be interesting to see what you try out! Another interesting aspect would be using bivectors and quaternions for that also. As @Mason mentioned, what you probably don’t want to do is overload internal Grassmann methods.

1 Like

Could somebody please help me figure out what’s going on with this performance variance?

So what’s exactly going on here? Why is it faster when I paste it into the REPL?

This doesn’t happen with the Base.in or method, which is fast without pasting into the REPL

julia> @btime $(barycenter(A)) ∈ $A
  61.499 ns (0 allocations: 0 bytes)

So, for the linear solve method there is a performance gain from pasting into REPL but not for .

Can anyone help me figure out why the linear solve method changes in performance like this?

That sure seems like the sort of strangeness you might see from Base.@pure abuse; order of definition suddenly matters.

9 Likes

Whether or not @pure is used doesn’t affect this, you need to actually back up your claim with proof.

:man_shrugging: I’m not saying it with certainty, but I do have a mental model for how it might happen. Of course it might be something else, but personally I don’t have much motivation to look into strange performance of a package I know is intentionally in unsafe and quirky territory.

3 Likes

It’s certainly not intentionally unsafe, although it may be “quirky territory” and experimental.

Regarding the performance difference upon function redefinition, this issue is possibly related: https://github.com/JuliaLang/julia/issues/32552.

4 Likes

It seems easy enough to do a global find and replace to remove the @pure and see what happens to the benchmarks.

1 Like

Indeed, it’s easy to test and prove that @mbauman is wrong. If you remove all usage of @pure:

julia> @btime $A\$(v1+2v2+3v3+4v4+5v5)
  13.384 μs (150 allocations: 11.94 KiB)

And after pasting the Base.:\ method into the REPL for the second time

julia> @btime $A\$(v1+2v2+3v3+4v4+5v5)
  9.251 μs (146 allocations: 11.78 KiB)

As you can see, without the usage of @pure the same performance difference exists after pasting the method into the REPL, and also it is much slower without the @pure annotations.

It’s most likely that this is the same issue @schmrlng referenced above.

1 Like

Made another commit which generalizes the Base.in method above to embedded affine simplices in a higher dimensional manifold. This allows checking whether a 4 dimensional point is contained inside an embedded affine triangle in only 150ns compared to checking whether a point is inside a full dimensional simplex of the manifold with only 50ns. Would like to challenge other developers to try and make a faster method for determining if a 4 dimensional point is inside an affine triangle without Grassmann algebra. The algorithmic method based on Grassmann algebra for this is extremely fast.

Also, this generalization can be used to compute the gradient of an embedded hat function efficiently.

4 Likes

This was because of a special explicit case for dimensions 1 and 2 and 3, which are now accounted for in Grassmann.jl also.

Also, support has been added for Moore-Penrose inverses for underdetermined and overdetermined linear systems. For underdetermined cases, the exterior product algorithm works ~20x faster than the SMatrix algorithm, and it is numerically stable. For overdetermined equations, the method used is based on the traditional normal equations, and this is prone to more numerical instability.

Hence, underdetermined systems are faster and numerically stable with Grassmann.jl than StaticArrays.jl, while over determined systems have a fast method which isn’t quite as numerical stable.

julia> @btime vandermonde(Chain(0.2,0.3,0.4,0.5,0.6),Chain(0.2,0.234,0.315,0.465,0.7),ℝ3)
  151.526 ns (3 allocations: 432 bytes)
0.36140000000000005v₁ - 1.4604285714285716v₂ + 3.364285714285713v₃

In addition to that, a vandermonde feature as been added for Vandermonde constructors, to help make polynomial fitting very very fast. I have some more plans for polynomial fitting and linear algebra in the future, as well as other basis calculations.

Screenshot_2020-08-14_16-24-14

Also, if you load UnicodePlots then you can add a 4th argument with number of plot points to see.

5 Likes