Why hasn't LinearAlgebra been removed from the default sysimage?

Would it be possible to have a code example to illustrate type piracy and why LinearAlgebra shouldn’t be removed from the sysimage? I am not familiar with the concept and I am not sure I understand the comment that was linked. Thank you.

1 Like
julia> A, v = rand(5, 5), rand(5);

julia> @which A * v
*(A::Union{LinearAlgebra.Adjoint{<:Any, <:StridedMatrix{T}}, LinearAlgebra.Transpose{<:Any, <:StridedMatrix{T}}, StridedMatrix{T}}, x::StridedVector{S}) where {T<:Union{Float32, Float64, ComplexF64, ComplexF32}, S<:Real}
     @ LinearAlgebra ~/.julia/juliaup/julia-1.11.3+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/matmul.jl:53

“type piracy” means that some package adds a method to a function on types it does not own — in this case it is LinearAlgebra.jl adding the method (*)(A::AbstractMatrix{T}, x::AbstractVector{S}) when it owns none of *, AbstractMatrix, or AbstractVector.

“own” is arguably more of a social concept than a technical one, but it generally just means “the first place something is defined”; other places then have to explicitly import it to extend the method table.
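For concreteness, here is a minimal sketch of what committing piracy looks like (MyPirate is a made-up module for illustration):

module MyPirate
# Type piracy: this module owns neither the function (Base.*) nor the
# argument types (Matrix{Bool}, Vector{Bool}), yet it adds a method.
# Merely loading the module changes the behavior of existing code that
# never mentions MyPirate.
Base.:*(A::Matrix{Bool}, x::Vector{Bool}) = error("hijacked!")
end

julia> [true false; false true] * [true, false]
ERROR: hijacked!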

7 Likes

More succinctly, it means that the LinearAlgebra module is not neatly self-contained. Simply defining it (which is what stashing it in a sysimage does) creates methods that are callable without using or importing it — and they’re methods that people currently rely upon.

Avoiding piracy would mean that the function itself or its argument(s) would need to come from a name that you can access only by way of using LinearAlgebra (or an explicit LinearAlgebra.something).
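For contrast, a sketch of the non-pirate case: if a type you own appears in the signature, extending the function is ordinary multiple dispatch (MyVecs and MyVec are made-up names):

module MyVecs
struct MyVec
    data::Vector{Float64}
end
# Not piracy: we don't own Base.*, but we do own MyVec, which appears
# in the signature, so this method is only reachable through our type.
Base.:*(A::Matrix{Float64}, v::MyVec) = MyVec(A * v.data)
end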

7 Likes

I think that I have a solution to the LinearAlgebra problem…

From what I understand, two things are preventing LinearAlgebra from being removed from the Julia binary:

  1. We would prefer not to commit type piracy on *, / and \
  2. Backwards compatibility

One could solve the first issue by making a package extension to Base that contains the definitions of the *, / and \ methods for the relevant Array types. These definitions could just call array_multiplication and array_division functions that are defined in either the LinearAlgebra package or in a new package, e.g. HeavyArrayOperators.

To solve the second issue, we could add a command-line option to the julia executable that would import LinearAlgebra or HeavyArrayOperators by default. People who would like to opt out of these operators (and thereby all of BLAS) could then do so with a command-line setting. Additionally, this would allow distribution of a “Julia Light” tgz file that would not include BLAS, LinearAlgebra, and maybe other standard libraries as well.
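A rough sketch of what such an extension might contain (purely hypothetical: Julia currently has no mechanism for package extensions to Base, and HeavyArrayOperators, array_multiplication, and array_division are invented names from this proposal):

module BaseHeavyArrayOperatorsExt
using HeavyArrayOperators: array_multiplication, array_division
# These operator methods would exist only once HeavyArrayOperators
# (or LinearAlgebra) is loaded; they forward to functions that the
# loaded package owns.
Base.:*(A::AbstractMatrix, x::AbstractVector) = array_multiplication(A, x)
Base.:\(A::AbstractMatrix, b::AbstractVector) = array_division(A, b)
end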

3 Likes

Wouldn’t it be the same as declaring LinearAlgebra to be a package extension of Base and allowing it to commit type piracy on Array?

Sorry, I don’t understand what you mean (I had never heard of type piracy). I am proposing that a new package extension to Base be created that is loaded when LinearAlgebra is loaded (I have not looked at the code for this, but I am assuming that there is no such extension at present) and that the definitions of the relevant operators be moved to this extension, so that they are not accessible unless LinearAlgebra is loaded.

that’s still type piracy, with extra steps 🙂

2 Likes

Yes, I understand that you’re thinking of a BaseLinearAlgebraExt sub-package, but I feel like it’s the same issue: an extension should not be allowed to change/add methods on types it does not own.

To me, an extension of Base is part of Base and therefore has the right to add methods when Base owns all the types, but if the consensus is that an extension is an independent package with no such rights, then of course this is not a solution.

Using this logic, an extension to DataFrames would not be able to define a new method for Plots.plot on the DataFrame type without committing type piracy. This would severely limit the usefulness of extensions.

1 Like

The fundamental usability problem remains: Folks currently assume that a * b can do matrix multiplication without an explicit using LinearAlgebra and a dependency in their Project.toml.

The problem isn’t the piracy per se, it’s the behavior that results.

Yes, in a normal situation an extension can be a good solution, because it’s only active when both packages are explicitly loaded. But that’s the very problem. LinearAlgebra is currently implicitly loaded.

13 Likes

Yes, currently it is implicitly loaded, but wouldn’t it be possible to separate the LinearAlgebra code out so that it is possible to start julia with something like:

julia --preloaded-modules=LinearAlgebra
# or
julia --preloaded-modules=

In other words, make “implicitly loading LinearAlgebra” the default option rather than the only option.

That is not true. Type piracy means that a package adds a method to a function it doesn’t own for types it doesn’t own. If the package owns the function or any of the types of the arguments, that’s not type piracy.

DataFrames owns DataFrame, so defining a method of Plots.plot (which it doesn’t own) for DataFrame arguments is not type piracy. As you point out, multiple dispatch wouldn’t make much sense if you couldn’t define new methods of Base (or other packages’) functions for new types.
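As a sketch of that ownership rule (in reality Plots integration goes through RecipesBase recipes rather than direct plot methods, so treat this only as a dispatch illustration):

module DataFramesPlotsExt
using DataFrames, Plots
# Not piracy: this extension doesn't own Plots.plot, but it does own
# DataFrame, which appears in the signature.
Plots.plot(df::DataFrame, x::Symbol, y::Symbol; kw...) =
    Plots.plot(df[!, x], df[!, y]; kw...)
end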

With extension modules, there is no rule for which package should define extensions, i.e., whether a method of Plots.plot for DataFrame types should be an extension in the Plots or in the DataFrames package. This is something package maintainers have to agree on. There might be a heuristic that smaller packages should add extension modules for bigger / more general (“base”) packages, not the other way around. But that’s not a hard rule.

LinearAlgebra probably does have instances of type piracy w.r.t. Base, so some parts of LinearAlgebra will have to be moved to Base to fix that.

8 Likes

That’s effectively the status quo. Your hypothetical --preloaded-modules isn’t much different from the current --sysimage switch. As noted in what’s now the first post in this split topic, you can build your own sysimage without LinearAlgebra or with arbitrary packages.

It’s the sysimage that defines the “preloaded modules.”
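For instance, with PackageCompiler.jl (a sketch; building a genuinely stdlib-free image is more involved than this):

using PackageCompiler

# Bake arbitrary packages into a custom image:
create_sysimage([:DataFrames]; sysimage_path = "custom_sys.so")

# Or build non-incrementally and drop stdlibs the project doesn't use:
# create_sysimage([:DataFrames]; sysimage_path = "custom_sys.so",
#                 incremental = false, filter_stdlibs = true)

# Then start Julia with: julia --sysimage custom_sys.so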

1 Like

I think one of the main motivations for excising LinearAlgebra is to remove OpenBLAS from the default sysimage. If you simply move the pirating methods over to Base you’ve now made OpenBLAS a dependency of Base, so that didn’t help. Some possible solutions are:

  • Put julia-native matmul and linsolve methods in Base, to be overwritten by BLAS-calling methods when LinearAlgebra is loaded
  • Add some kind of hook that autoloads LinearAlgebra when you call one of the relevant methods

I was under the impression that the folks working on stdlib excision have already decided on the path forward along one of these lines. Is that incorrect?

2 Likes

Method overwriting is disallowed for precompiled modules now. The fact that Base is necessarily loaded before anything else might make overwriting it an exception to the loading-order problem, but it’s still going to cause a lot of cache invalidation between packages that use LinearAlgebra and those that don’t. Besides, people won’t be happy with much less optimized matrix multiplication just to save some code loading; nothing really makes up for the fact that v1 included a lot of math in Base.

How feasible is it to split off the parts of OpenBLAS needed for Base, e.g. *(::Matrix, ::Matrix)? Even if that were done magically, would it even save much?

1 Like

There’s a lot in LinearAlgebra that pirates Base, it would be more than just *, /, and \. Just subselecting from the imports at the top of LinearAlgebra.jl, these functions are almost certainly pirated for AbstractMatrix:

\, /, *, ^,
acos, acosh, acot, acoth, acsc, acsch, asec, asech,
asin, asinh, atan, atanh, cbrt, cis,
cos, cosh, cot, coth, csc, csch, exp,
inv,
log,
sec, sech,
sin, sincos, sinh, sqrt, tan, tanh

To be honest, it’s a shorter list than I would have expected. I guess other flavors of these methods (like sinpi or exp2) probably just work from generic fallbacks, which is nice. But there are some obvious absences like muladd so some others must be hiding in the weeds.

Although functions like +, -, and conj on AbstractArray are defined in Base, I would contend that they only “make sense” in a linear-algebraic sense and “ought to” only be defined in LinearAlgebra (broadcast is more appropriate for general array use).

For some reason, isone(::AbstractMatrix) is pirated in LinearAlgebra but the AbstractMatrix methods for one, oneunit, zero, iszero are all defined in Base. The one-flavored versions probably all belong in LinearAlgebra, since it doesn’t mean anything for an N-dimensional array to be “one.”
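These ownership claims are easy to spot-check in the REPL (exact methods vary by Julia version):

julia> parentmodule(which(exp, Tuple{Matrix{Float64}}))
LinearAlgebra

julia> parentmodule(which(isone, Tuple{Matrix{Float64}}))
LinearAlgebra

julia> parentmodule(which(one, Tuple{Matrix{Float64}}))
Base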

4 Likes

I want to emphasize the solution I’ve been writing into the GitHub issues around this. IMO we’re taking for granted a set of premises that contradict each other:

  1. We need to have a*b in Base.
  2. a*b is defined using BLAS
  3. We don’t want to always load BLAS
  4. We want to always load the method for a*b

Those are simply an incompatible set of assumptions. Which one needs to go?

We keep on arguing about maybe changing (1), (3), or (4). In fact, not having (3) was the old “solution”, but it led to large load times. So we used to always load BLAS, and now we’re looking at how to sneakily load BLAS on demand.

Here’s a question, can we consider getting rid of assumption (2): not define a*b using BLAS by default?

Here’s how that could look. We could have Base Julia ship with a really small BLAS shim that is dead simple: a three-loop matrix multiplication. By putting this in as a shim BLAS, we can still set up those same functions with libblastrampoline. Then we could change LinearAlgebra.BLAS to an OpenBLAS module, where using OpenBLAS triggers the libblastrampoline changes. Note that this would not require BLAS-calling methods to be overwritten when BLAS is loaded, since this just means that the implementation of a*b is using LBT, and we just ship a simpler LBT backend.
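For concreteness, a sketch of the kind of dead-simple kernel such a shim could contain (naive_matmul! is an illustrative name, not a proposed API; a real shim would plug into libblastrampoline's gemm entry points):

# Naive three-loop matmul; the j, k, i loop order is friendly to
# Julia's column-major memory layout.
function naive_matmul!(C::Matrix{T}, A::Matrix{T}, B::Matrix{T}) where {T}
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("A and B"))
    size(C) == (size(A, 1), size(B, 2)) || throw(DimensionMismatch("C"))
    fill!(C, zero(T))
    @inbounds for j in axes(B, 2), k in axes(A, 2), i in axes(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end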

This wouldn’t handle the other matrix functions, but it would handle the biggest and baddest of cases. It would have a nice side effect too, since then using OpenBLAS would be on the same footing as using MKL and any other BLAS. This is something I would really like, because OpenBLAS is not always a good choice, so moving it away from being a default is ultimately good for performance. For legacy purposes, we can make using LinearAlgebra also do using OpenBLAS; const BLAS = OpenBLAS in the module so that nothing breaks.

Finally, if we really feel strongly about users potentially using the “wrong” BLAS, i.e. the simple BLAS shim, we could by default throw a warning on the first call that uses the shim:

Warning: Default BLAS usage detected. This results in a slow operation for this function. It is recommended that you load a BLAS for your given platform. For example, using AppleAccelerate is recommended on M1 Macs, using MKL is recommended on Intel CPUs, and using OpenBLAS is recommended on other platforms. See doc page xxxxxxxx for details. If the default BLAS is being used on purpose, set Base.warn_blas[] = false to turn off this warning.

This would both improve performance for many users, since instead of just sticking them with OpenBLAS we would now be pointing them to the right BLAS most of the time, and it would be very clear to anyone who needs performance.

As a nice aside, this would also likely make binary builds a bit easier, since then BLAS is not used by default, and you could do a binary build that uses the 3-loop matmul shim if you care about size more than performance (which many controls applications with small matrices would likely prefer).

So together… I think this is the right trade-off.

26 Likes

Wouldn’t it be cleaner (and easier to excise LinearAlgebra) if “arrays” and “vectors/matrices” were different objects…
A Julia AbstractVector is often not a mathematical vector, and mathematical vectors aren’t necessarily Julia AbstractVectors.

Of course changing that would be breaking now.

6 Likes

But to avoid breakage you would need a solution for these somehow, and if your solution is a native BLAS, I can’t see an alternative to covering all the BLAS/LAPACK functions that are reachable without loading LinearAlgebra. You can’t simplify the interface to, for example, only provide the LU factorization, even if that were technically sufficient to implement the required matrix functions, because you’re plugging in the shim at a lower level, where you don’t get to decide which algorithms are used.

1 Like

Those other matrix functions are not BLAS/LAPACK functions but native Julia functions IIRC.

1 Like