Minimal Julia: What do you want in Julia, or not?

My UPX experiment wasn’t too successful, if I recall, so I didn’t suggest using it for Julia (even better than compressing is just eliminating). I forget the details; they’re likely here in the thread. If it had helped, or if I’m wrong and you get it to work, then please make a PR. I don’t know about “strip”; maybe at least add a PR for that?

I’ve gotten rid of LinearAlgebra privately. It’s not too complex, and it need NOT be a breaking change (same API): only e.g. multiplying matrices needs to go to a slower matmul, until using LinearAlgebra. Moving to a new repo was, I think, only for code organization, and to have issues relating to it tracked there, though it might be a first step toward getting it out of the sysimage.
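As a rough illustration of the "slower matmul" fallback mentioned above, a generic matrix multiply needs nothing from BLAS. The name `naive_matmul` is made up for this sketch and is not a function in Base or LinearAlgebra; it also assumes ordinary 1-based arrays.

```julia
# Hypothetical BLAS-free fallback; `naive_matmul` is an illustrative name only.
# Assumes standard 1-based indexing for both inputs.
function naive_matmul(A::AbstractMatrix, B::AbstractMatrix)
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("A has $k columns but B has $k2 rows"))
    T = promote_type(eltype(A), eltype(B))
    C = zeros(T, m, n)
    # Loop order keeps the innermost index on the first dimension (column-major friendly).
    for j in 1:n, p in 1:k, i in 1:m
        C[i, j] += A[i, p] * B[p, j]
    end
    return C
end

C = naive_matmul([1 2; 3 4], [5 6; 7 8])
```

A fallback like this would be much slower than BLAS on large matrices, which is the trade-off the post describes.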

I want to get rid of [Open]BLAS. And more, e.g. MPFR, which I’m looking at now (there are better/faster alternatives, and neither BigFloat nor BigInt needs to be in Base, IMHO).

I’ve also found a much faster regex library (at least for some edge cases), so PCRE2 should also go.


That’s all highly speculative. One concrete path forward here is the upcoming v1.12’s --trim.


I managed to remove LinearAlgebra from the sysimage. It reduces the size of the Julia installation folder and seems to improve the startup time. LinearAlgebra can still be used as normal, so I am not sure why it has not been removed from the sysimage yet?

Regarding UPX, it did not work for me on Windows. However, “strip” worked well on all the DLLs in the bin folder as well as on sysimage.dll, and I haven’t noticed any negative effect. I was wondering if anyone else has tested that?

In total I managed to reduce the size of Julia 1.13 from 1.1G to 850M.


I think Piracy in the StdLibs · Issue #30945 · JuliaLang/julia · GitHub is a good tl;dr


And if you delete all *_T3QLl.dll from share\julia\compiled\v1.13, like I mentioned in Julia installation file sizes - #10 by gitboy16, you shave off another ~190 MB. Also, apparently, without any negative effect.
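Before deleting anything, it may be worth measuring how much space those files actually take on your own installation. The helper below is only a sketch: `matching_bytes` is a made-up name, it does not delete anything, and the path in the commented example call is a placeholder.

```julia
# Hypothetical helper: sum the sizes (in bytes) of all files under `dir`
# whose names end in `suffix`. It only reads file sizes; nothing is removed.
function matching_bytes(dir::AbstractString, suffix::AbstractString)
    total = 0
    for (root, _, files) in walkdir(dir)
        for f in files
            if endswith(f, suffix)
                total += filesize(joinpath(root, f))
            end
        end
    end
    return total
end

# Example call (the directory is an assumption; adjust it to your installation):
# matching_bytes(raw"C:\path\to\julia\share\julia\compiled\v1.13", "_T3QLl.dll")
```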


This looks rather interesting. I wonder how close to PCRE2-compatible it is…


11 posts were split to a new topic: Why hasn’t LinearAlgebra been removed from the default sysimage?

I am clearly not in the right league to comment on the technical aspects in this thread, but I feel like the perspective of an average (or below average!) user is missing. Please, do not turn Julia into another Python, where basic mathematical functions require importing additional packages (yes, this includes linear algebra in my view). For instance, I already find it odd and somewhat frustrating that I have to use using Statistics just to access the mean function. Maybe that’s just me, maybe not…


I echo this. I encounter this issue daily! Every time I start a new script/notebook, I forget to import Statistics just to use mean! Not a big deal maybe, but really a bad experience!

I remember I have expressed my strong opposition to removing Statistics, especially mean, from Base here on Discourse, but in vain, sigh~


I go back and forth on this all the time. Like, yes, having an R-like statistical standard library would make the transfer to Julia super easy. But there is also a lot to be said for trimming the fat and keeping statistics out of the environment unless they are needed. My main piece of advice on this front is “if your code is internal to you, and it bothers you, add Statistics to your startup file.” Outside of that, it’s just one of those things that can be annoying.
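For anyone who hasn’t used a startup file: Julia runs `~/.julia/config/startup.jl` when it starts (unless launched with `--startup-file=no`), so the advice above amounts to something like this minimal sketch:

```julia
# Contents of ~/.julia/config/startup.jl
# Loaded when Julia starts, so `mean`, `median`, etc. are available right away.
using Statistics
```

With that in place, `mean([1, 2, 3])` works in a fresh REPL without an explicit import. Scripts and packages you share with others still need their own `using Statistics`.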

Edit from 3/11/25: Although, as I’ve been working on Dendrochronology.jl I learned Julia is missing some fundamental time-series statistics that come in r::stats. I’m making a hacky solution rn, might try to integrate into TimeSeries.jl or AutoRegressions.jl


Needing to import Statistics/StatsBase for stuff like mean or countmap used to trip me up a lot when I wanted to “just” do something simple in the REPL.

You can add this to your global environment/startup, but then it’s really easy to share code/packages that use one of the functions that you forgot isn’t part of the base language.


When discussing what to keep in Base and what to have in packages, I think it’s important to realize how differently people use programming, even in science.
There is a risk that Julia can cater too much to a niche audience, which causes Julia users to be the self-selected group who cares about those features, which in turn feeds the idea that “everyone wants this feature”. Right now, Julia very much caters to the “MATLAB” audience who works a lot with arrays of floats - and I happen to think that this self-selection process has already happened, and it’s detrimental to the language. You will notice that when Julia comes up in discussions outside the community: People talk about it as a domain-specific language for numerical computing and not a general purpose language.

For example, I don’t really care about multidimensional arrays and linear algebra at all. I find it totally unnecessary that we have QR factorization in the system image - I went to university and don’t even know what that means. I’m never going to use it. To me, that’s a great example of something niche that absolutely belongs in a package.

On the other hand, I think the functionality provided by the StringViews, BufferedStreams and MemoryViews packages is much more basic and really does belong in Base.

So who is right? None of us, of course. Both camps have this idea that the functions we happen to want in Base are not that many and won’t bloat Julia (although BLAS in particular really does its share of bloating…). Whereas the truth is that Julia would become enormous if it included all the “commonly used functionality” that people expect to be in the system image, because it’s all different stuff for different people.

The only viable approach is to have a much smaller Base without any domain specific stuff. In my opinion, Base should only contain general “computer sciency” stuff like Dicts and Vectors and sorting and such, as well as some very basic abstract types/interfaces like AbstractArray and AbstractString. Using that, people can build everything else as a package.


I agree with you. Also, if someone wants to make a closed-source scientific Julia at some point (as MATLAB is for Python/Fortran etc.), nothing stops them from doing so.
And if someone wants all that in Julia, they could still make a sysimage with their favorite packages and use that.
I’m not sure we should get rid of multi-dim arrays though; they can be useful for newcomers (they avoid index mind-blow in some cases) and they are particularly well designed in Julia.
The question for *(AbstractArray, AbstractArray) and \(AbstractArray, AbstractArray) is still hard, but we could just make it a method error with a little tip to install LinearAlgebra (breaking, but not that annoying).
Also, we could distinguish between packages that are allowed to break type-privacy and others without having them in the sysimage; they would be more language add-ons than packages.
I’m not sure the language design is the only thing that makes Julia kinda “niche” though, but that’s another thread for another time.
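The "method error plus a little tip" idea can be sketched with `Base.Experimental.register_error_hint`, which lets code attach extra text to an error message. To avoid touching `*` itself, this sketch uses a hypothetical stand-in function `matmul` that has no methods; the hint wording is made up as well.

```julia
# `matmul` is a hypothetical stand-in for `*`; it is declared with no methods at all.
function matmul end

# Attach a hint that is printed whenever a MethodError for `matmul`
# with two array arguments is displayed.
Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, kwargs
    if exc.f === matmul && length(argtypes) == 2 &&
       all(T -> T <: AbstractArray, argtypes)
        print(io, "\nMatrix multiplication lives in LinearAlgebra; try `using LinearAlgebra`.")
    end
end

# Capture the message a user would see at the REPL.
msg = try
    matmul([1 2; 3 4], [5 6; 7 8])
catch err
    sprint(showerror, err)
end
```

Here `msg` holds the usual MethodError text followed by the hint, so the user is told where the functionality lives instead of only seeing "no method matching".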


I think this was not about removing multi-dimensional arrays, but rather the plethora of functions that work only on numerical tensors, but have no generic use otherwise.

Maybe the numerical focus was good in the beginning to have some clear differentiator, so it made sense to include all that stuff. But now that we have apps, binary compilation, more need for tooling, I think it becomes clear that being able to run Julia with the least bloat possible is important for the future development of the ecosystem.


Yes, that shouldn’t be inside the Julia sysimage at all, I agree, especially with sysimage now, but allowing non-type-private packages in really specific cases shouldn’t be that scary, should it? Also, having a method error on * for AbstractArray wouldn’t be a problem at all, for the same reason +(String,String) is not a problem right now.

An example I like a lot is python / sagemath. There are basically two modes you can run in:

  1. In “sage”-mode, which means that you are basically running “sage”-code: A dialect of python designed specifically for the stuff that sage is good at. Of course you don’t need explicit imports for linear algebra!
  2. In “python”-mode, which basically gives you standard python, but you can access sage-specific functionality “like a library”, i.e. by importing and using a somewhat pythonic API.

The “python”-mode exists for technical reasons: Ideally sage would be usable as a normal library/package. But there is some necessary runtime support, i.e. the sage package only works in the sage fork of the cpython runtime.

That way, you get to eat your cake and have it.

A julia analogue would be: Ruthlessly excise code from base, and have a meta-package, maintained and blessed by the core team, that re-exports relevant functionality. Then one could get the “batteries included” experience by a simple using Batteries.
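To make the idea concrete, here is a minimal sketch of what such a meta-package could look like. `Batteries` is the hypothetical name from the paragraph above, and the particular set of re-exported names is just an arbitrary pick for illustration.

```julia
# Hypothetical meta-package: it owns no functionality of its own and only
# re-exports a curated set of names from other packages.
module Batteries

using Statistics: mean, median, std
using LinearAlgebra: dot, norm

export mean, median, std, dot, norm

end # module Batteries

# In a real, installed package this would simply be `using Batteries`.
using .Batteries

m = mean([1, 2, 3])
n = norm([3.0, 4.0])
```

The user writes one line and gets `mean`, `norm` and friends, while Base itself never has to know about them.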

In such an approach, it is essential that there really is one and only one Batteries meta-package that is documented and blessed and tested and owned by the core julia org.

Community owned competing batteries-like packages are not inherently a problem. However, the people who want the matlab experience must not be bothered to make a choice on their batteries-like metapackage, or to figure out the necessary imports for all functionality they need. A simple using Batteries is pretty much the max of inconvenience one can force on them.


That can be an interesting option.
A potential benefit of trimming down Julia is that it may reduce the resources (time and money) spent on CI/CD and developers’ time.
I am ignorant on the topic, but breaking these things down (moving documentation and testing to their own repos) could probably also save resources? For example, is it really necessary to run the tests and the Julia build suite when a typo is corrected in the documentation?
I wonder what the core developers’ views on these topics are, whether it is something they have in mind, and whether Julia’s growing installation size is really not a cause for concern.


I think it’s possible also to “have a much smaller Base” as well as having a larger sysimage option; imagine juliaup add release --bundle=linalg vs --bundle=lightweight


FYI: “Type piracy” is an almost 10-year-old term in Julia-land. I didn’t know it before that either, since I believe it’s Julia-specific, or at least not a common CS term:

https://groups.google.com/g/julia-users/c/gGbaUVETvwQ

LinearAlgebra has the “excision” label (there):

At least to discuss how, or how difficult. Many more have that label at JuliaLang, but since LinearAlgebra already has its own repo (recently), it’s now the only issue there with that label.

It’s easy to get rid of it in a breaking change, and that’s what would happen, because of the type piracy, if it were just dropped in a naive way. It’s also easy to keep it fully non-breaking (I’ll not repeat myself here; I see others have recently posted very similar ideas).

FYI: Other examples:

Note, Julia has already rewritten all of libm from C into Julia, so that will have no change. As I recall it’s only 32-bit Windows holding back for obscure reasons.

HOWEVER, I think we might want to go back to a C library libm. Off-topic for here, but only *, /, +, - and sqrt are correctly rounded in accordance with IEEE. IEEE doesn’t demand more, for arbitrary math functions, since it was thought impossible to do (and fast), but a recent libm did so. I believe this one for Float32:

and I’m just now seeing something for Float64:
https://www.worldscientific.com/doi/10.1142/S0218194023500675

There’s also:

We may think we want sqrt available; it’s a common function. But because it’s defined in Julia, it’s generic, and it also applies to the square root of a matrix (not element-wise, that’s yet another operation), and looking at the code for it, it brings in a lot more obscure (to me) code. I’ve never had to do it; at best I’ve squared a matrix or taken an integer power of one. I’m not sure how common roots or arbitrary powers of a matrix are… [And trivia: one of the quantum gates is the square root of NOT.]

Let us call the resulting logic gate the square root of NOT […]
It may seem reasonable to argue that since there is no such operation in logic, […] But it does exist!
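For readers who haven’t run into the distinction between the matrix square root and the element-wise one, here is a small sketch. The matrices are arbitrary examples, and `using LinearAlgebra` is written out explicitly so the snippet does not depend on what happens to be in the sysimage.

```julia
using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]

S = sqrt(A)      # matrix square root: S * S reproduces A
E = sqrt.(A)     # element-wise square root: E .* E reproduces A, but E * E does not

S * S ≈ A        # true
E .* E ≈ A       # true
E * E ≈ A        # false

# The "square root of NOT" from the quote: NOT swaps the two basis states.
X = [0.0 1.0; 1.0 0.0]
R = sqrt(X)      # complex-valued, since X has an eigenvalue of -1
R * R ≈ X        # true
```

So a single generic `sqrt` covers scalars and matrices alike, which is what pulls the extra matrix code in.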