Do not assume BB binary is used everywhere

I hope you’ll be happy to know we’re aware the situation isn’t ideal and we actually do have a plan to split the runtime from the dev part of the package, and maybe even optionally install debug symbols. Help with making this happen would be much appreciated.

12 Likes

Thanks. Happy to hear that. And I know the value of help. I do contribute (with lots of time) to other FOSS programs where I have more expertise.

2 Likes

@ChrisRackauckas once said that people mostly discuss non-technical things because they feel they understand them.

I use his quote to introduce my non-technical experience. I am really happy with the current state. I really love that I just hit ] add and it works and I do not have to care. Before, there were some packages that I just did not install on my Mac (some GL-based libraries) because they did not compile. So I am probably happily living in an innocent bubble caused by my lack of knowledge, but again, I am happy I do not waste time searching for and solving why things do not compile.

12 Likes

In the same vein as @ChrisRackauckas, a great poet: “The human spirit naturally tends to make judgements based on feeling instead of reason” - Fernando Pessoa

Or my favourite quote of all:

My mind is made up - don’t confuse me with facts.

It seems we run into this almost everywhere these days.

1 Like

Because then it’s impossible to use things together with each other. Again, as I said, having multiple sources is not a problem; forcing one to use one source is. And that’s what’s been happening, with all the intentional breakage when not using JLL-provided binaries and the claims that any such breakage is a good thing.

The only thing it did not work for is LLVM. And working with the distro is a perfectly viable option.

I don’t see how it didn’t work. It works perfectly well here without me having to do anything. Yes, part of the job has to be done by the distro maintainer, and the fact that the Julia PPA was broken is the fault of whoever maintains it. There were other distro packages that worked just fine at the time.

And that’s what I said about breaking things, and as I’ve also said before, having JLL isn’t a problem. It’s only when you start to break previously working setups that it becomes a problem, and that has happened multiple times, and the build problems I’ve reported are being ignored.

I have not mentioned hard drive space, but since the discussion was derailed by that comment, I’ll explicitly say that no, it’s not a space problem at all. The problem is when you force people to use a set of binaries and regress on a previously perfectly working setup.

Also because you’ve just kicked out people who care. And frankly, using JLL in packages is something I managed to work with; simply replacing the whole JLL package with something simpler is easier to do when it follows a consistent format. However, it’s only when people start to assume everyone has to use JLL packages that it becomes unworkable.

Yet another reason is that since people maintaining other languages spend time and effort to make sure they can work with non-prebuilt binaries, they currently have little problem working with the Julia one. And then see my comment above about selfishness.

No, Python doesn’t. I’ve yet to see many Python packages that require you to download a prebuilt binary.

  1. BinDeps supports using system libraries and talks to the system package manager, so yes, it existed.
  2. Base Julia wasn’t assuming anything about it before.

As far as I understand, an important pragmatic disadvantage of BB binaries compared to building from source on the target machine is that they are generic. That is, nothing like -march=native or -ffast-math is possible within BB for performance-sensitive packages.

You can build for specific microarchitectures. Also, most performance-critical libraries use cpuid to pick an optimized code path at runtime, which makes targeting a specific microarchitecture at build time largely irrelevant.
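To illustrate the runtime-dispatch idea in the post above, here is a minimal Python sketch. It is only an analogy: real libraries query cpuid from native code, while this sketch reads the feature flags from `/proc/cpuinfo` (Linux on x86 only, an assumption) and falls back to a generic kernel elsewhere. The kernel names `dot_generic`, `dot_avx2`, and `dot_avx512` are hypothetical stand-ins that all compute the same result.

```python
from pathlib import Path


def detect_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the set of CPU feature flags, or an empty set if unavailable.

    Assumption: Linux on x86, where /proc/cpuinfo has a "flags" line.
    """
    path = Path(cpuinfo_path)
    if not path.exists():
        return set()
    for line in path.read_text().splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def dot_generic(a, b):
    # Portable fallback that works on any CPU.
    return sum(x * y for x, y in zip(a, b))


def dot_avx2(a, b):
    # Hypothetical stand-in: a real library would run AVX2 code here.
    return sum(x * y for x, y in zip(a, b))


def dot_avx512(a, b):
    # Hypothetical stand-in: a real library would run AVX-512 code here.
    return sum(x * y for x, y in zip(a, b))


# Ordered from most to least specialized.
KERNELS = [("avx512f", dot_avx512), ("avx2", dot_avx2)]


def select_kernel(flags):
    """Pick the most specialized kernel the CPU supports."""
    for flag, kernel in KERNELS:
        if flag in flags:
            return kernel
    return dot_generic


# One generic binary, with the fast path chosen when the program starts.
dot = select_kernel(detect_cpu_flags())
print(dot.__name__, dot([1.0, 2.0], [3.0, 4.0]))
```

The point of the design is that the same generic binary can ship to everyone and still use the fastest path each machine supports.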

-ffast-math is discouraged (but not impossible) because of hard-to-track bugs that affect completely unrelated code, like JuliaCI/BaseBenchmarks.jl issue #253, “benchmarking ("exponent", "subnorm", "Float32") produced the DomainError when ApproxFun package is also used”.
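One commonly cited way such a flag can leak into unrelated code is that a library built with it may switch the CPU into flush-to-zero mode for the whole process when it is loaded. Whether that was the exact mechanism in the linked issue is an assumption here. The Python sketch below only simulates the effect: `flush_to_zero` and `exponent` are illustrative helpers, not anything from Julia or the libraries involved.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest positive normal double


def flush_to_zero(x):
    """Simulate flush-to-zero mode: subnormal values are treated as zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


def exponent(x):
    """Binary exponent of x; undefined for zero, like a domain error."""
    if x == 0.0:
        raise ValueError("exponent of zero is undefined")
    return math.frexp(x)[1] - 1


tiny = 5e-324  # smallest positive subnormal double, 2**-1074

print(exponent(tiny))  # works while subnormals are preserved

try:
    # The same call fails once subnormals are flushed to zero, even
    # though this code never asked for fast-math behaviour itself.
    exponent(flush_to_zero(tiny))
except ValueError as err:
    print("error:", err)
```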

1 Like

Sure, that makes total sense for a general repository of packages/binaries. But sometimes the user knows that in his specific case compiling a library with -ffast-math is faster and shouldn’t break anything.

It would be cool to combine the artifact overrides mechanism with the huge set of existing BB recipes… That is, a user could locally edit the build_tarballs.jl script, change some options, and compile. Then all packages would use this new version of the dependency.
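For context, the existing override mechanism is file-based: Pkg reads `artifacts/Overrides.toml` in a depot (for example `~/.julia/artifacts/Overrides.toml`), where a section keyed by a package UUID maps an artifact name to a local directory. The Python sketch below just renders such a section as text, to show what a locally compiled replacement would need to produce. The UUID, the artifact name `LibFoo`, and the install prefix are placeholders, not real values.

```python
import json


def render_override(pkg_uuid, artifact_name, local_dir):
    """Render one UUID-keyed section in the Overrides.toml layout."""
    # json.dumps gives a double-quoted, escaped string, which is also
    # a valid TOML basic string for ordinary paths.
    return f"[{pkg_uuid}]\n{artifact_name} = {json.dumps(local_dir)}\n"


# Placeholder values: substitute the JLL package's real UUID, its
# artifact name, and the prefix where the custom build was installed.
snippet = render_override(
    "00000000-0000-0000-0000-000000000000",
    "LibFoo",
    "/opt/libfoo-fastmath",
)
print(snippet)
```

Every package that depends on the overridden artifact then picks up the local build, which is the "all packages use this new version" behaviour the post asks for. The compile step itself is still manual.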

And how do you plan to propagate this to downstream users if you develop a package that needs a _jll with a modified compile option?

Why should such changes “propagate”? I’m mostly talking about compilation flags that improve performance at expense of something else (eg portability or precision), and other users are free to use the generic binary or apply similar modifications themselves.

Ok cool.
So your problem is with package authors using BinaryBuilder, rather than BinDeps, for Julia.
Seems legit.

I still don’t understand what makes it “selfish”.

I would have thought selfish implied that Julia was burning some shared resource.
Of which hard drive space is the first that comes to mind.

Is it maybe about user mental capacity for maintaining multiple configurations?
Especially, I guess, in environments like shared clusters where software may be specially tweaked.
The override mechanism was designed specifically for that, with MPI as the example use case.
I can’t say I have experience with that.
For me the mental capacity has been freed up by this.
Instead of having to nurse half a dozen BinDeps setups for different binary dependencies onto the docker images for production or CI or a new staff-member, (which every now and again break in weird ways).
It just works.

But I guess I am not the target of your comment,
since i don’t need to maintain multi-user HPC systems that have many different programming languages in use.
Just employee laptops that are 90% used just for julia (esp for computing), and docker images that literally only have to run 1 julia application at a time.
(albeit often a very large julia application)

I guess I have little to add to the conversation if it is just meant to highlight the concerns of HPC system administrators and users.

Or have I got it wrong again, and it isn’t about hard drive space or HPC system administration?

2 Likes

Are we not talking about Julia-related usage now? Also, can’t the same be said for any Linux package manager?

Problems can start to arise when you want to do inter-language calling. Then you sometimes have to make sure that the different languages are linked to the same library in order to share its facilities.

For example, suppose you are doing distributed-memory programming using MPI, and you want to call a parallel package in Python that uses mpi4py. In order for this to work, mpi4py probably needs to be linked to the same libmpi that Julia uses. Furthermore, in both cases, you often want to link to an MPI library that has been tuned for the cluster or supercomputer you are using, rather than the MPI_jll version. Fortunately, the MPI.jl package provides an environment-variable override that allows you to build using a system MPI instead of the JLL (in the future this may use Pkg preferences).
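As a rough sketch of that override, the Python snippet below prepares the environment for a rebuild of MPI.jl against a system MPI. The variable names `JULIA_MPI_BINARY` and `JULIA_MPI_PATH` reflect my understanding of MPI.jl's documentation at the time and should be treated as assumptions to check against the installed version. `system_mpi_env` is a hypothetical helper, and the actual Julia invocation is left commented out.

```python
import os
from ctypes.util import find_library


def system_mpi_env(base_env, mpi_path=None):
    """Copy an environment and request a system MPI build of MPI.jl.

    Assumption: MPI.jl reads JULIA_MPI_BINARY and JULIA_MPI_PATH
    when it is built.
    """
    env = dict(base_env)
    env["JULIA_MPI_BINARY"] = "system"
    if mpi_path is not None:
        env["JULIA_MPI_PATH"] = mpi_path
    return env


env = system_mpi_env(os.environ)

# Which libmpi would a plain dynamic lookup find on this machine?
# This prints None when no MPI library is on the search path.
print(find_library("mpi"))

# Rebuilding would then look roughly like this (not run here):
# import subprocess
# subprocess.run(["julia", "-e", 'using Pkg; Pkg.build("MPI")'],
#                env=env, check=True)
```

For inter-language use, the thing to verify afterwards is that mpi4py and MPI.jl both resolve to the same libmpi.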

So, at least for libraries where one might share data structures with packages coming from other languages, it’s good to have a simple mechanism to override the JLL.

On the other hand, there are many libraries where this is not the case. For example, it seems perfectly fine to me that both SpecialFunctions.jl and SciPy link to their own builds of my C Faddeeva code for their complex erf or the Amos Fortran code for their Bessel functions. It wastes a negligible (nowadays) amount of disk space, the results can still be passed back and forth, and there are no other conflicts created.

27 Likes

@yuyichao There seems to be an assumption there that this was not intentional. On the contrary, it is because the authors of ygg / Yggdrasil / BinaryBuilder / etc. are well aware of the prior art, as tested by these distros, that this undertaking has been attempted. There are several key differences: unlike distros, this content is path-agnostic, supports Windows, and supports content-based versioning. There are some distros that do one or two of those (with some like cygwin/msys and Nix coming closest), but most could not be ported to Windows, thus forcing us to make our own.

As others mention, we’ll continue to take contributions that support broader special usages! (just not at the expense of the basic user experience)

9 Likes

I was talking mostly about julia-related usage of binaries, but not sure what’s the difference in this context. It doesn’t matter whether one calls a pre-compiled library/executable from julia or from another program, it works the same.

You mean specifying custom compilation options?
Source-based package managers, such as gentoo or arch, provide a built-in way for this. Binary package managers typically don’t, but I still can download and compile anything myself with whatever options I want. This compiled version can then be used from anywhere.

I definitely noticed when CUDA.jl started downloading a lot of data for ca. 1 hour before I could use it. I begrudgingly accepted it because I know how difficult the integration is, but I wasn’t thrilled either.

In academic environments we often have restrictive quotas; for example, I’m currently limited to 0.8TB, and these things add up. I don’t care about a small Fortran library here or there (think Dierckx), but stuff like the CUDA toolkit really has an impact.

EDIT: same goes for Conda – very paranoid I will rebuild a PyCall dependency and accidentally install several Gigs worth of NumPy/SciPy packages

3 Likes

The difference is that Julia packages rely on binaries. If you compile a binary with some flags and make a Julia package based on it, how do you plan for users to install your package (and your binary with its special compile instructions)?

Do you suggest Julia asks users to compile every binary each package needs?

Pacman is primarily a binary package manager; I would guess 95%+ of Arch users don’t compile every package from source.

you still can, ygg doesn’t replace your system lol…