No documentation for the state of patches for dependencies of Julia

doronbehar · August 11, 2020, 8:19am

I’m trying to package Julia for my distro (NixOS) in the best possible manner. We’d like to use our own dependencies as much as possible, but I’m having trouble figuring out your policy and your status regarding this. Here are some questions:

Why do you keep patches for multiple versions of LLVM? (https://github.com/JuliaLang/julia/tree/master/deps/patches)
Do you keep track of which patches you have offered upstream, and whether you got a reply? I tried to ask that, e.g. for gmp’s patches here, but got no reply.
What’s the status of the p7zip patches? Is there no release available with these patches included in mainline GNU/Linux distributions? If you don’t need them, why keep them in the patches directory?
If you are so comfortable patching the dependencies you use, you make so many such changes, and upstream doesn’t include them, wouldn’t it be better to use forks, making it easy for everyone, including yourselves, to see the current status of your changes versus upstream?
You do support using system’s dependencies but on the other hand you don’t recommend it (docs)? For instance, you don’t explain what’s the difference between yours and upstream’s libuv…

garrison · August 11, 2020, 12:24pm

Prior (and ongoing) discussion on the NixOS side at https://github.com/NixOS/nixpkgs/issues/91930

nalimilan · August 11, 2020, 1:41pm

FWIW, for the Fedora package, I bundle LLVM and libuv, which are the two dependencies for which Julia-specific patches are really essential (Julia even has a fork for libuv). For other deps I just use the system libraries and things work fine. (Note that Fedora has an ILP64 OpenBLAS with symbol names suffixed with 64_, which other distributions generally lack, though.)

In general, except for the libuv fork, I think that patches are moved upstream. For example, if you look at the p7zip patches you mention, the files give URLs to the upstream bug tracker. If that’s not the case then it’s probably a mistake. LLVM generally moves too fast, so that by the time Julia devs have developed patches for the current release LLVM has moved to the next one and won’t backport them (which I would say is a problem with their development process). Patches for older LLVM versions are probably kept just so that it’s still possible to test these versions e.g. to check performance regressions; but you shouldn’t care about that for packaging.

If that’s useful, the RPM .spec file is here.

doronbehar · August 12, 2020, 8:13am

Thanks @nalimilan for replying.

Julia even has a fork for libuv

I inspected that fork a bit. I haven’t learned much about which differences between the versions are significant, as Julia’s fork doesn’t seem well maintained. In terms of tags, libuv is at 1.38 now and Julia’s latest tag is 1.27.0. What’s even more ironic is that upstream libuv’s README takes pride in the fact that Julia is a user of libuv. Is it IMPOSSIBLE to contribute whatever patches you need upstream??

For example, if you look at the p7zip patches you mention, the files give URLs to the upstream bug tracker.

So if they were accepted upstream, why keep using a bundled version of them?? ref.

LLVM generally moves too fast, so that by the time Julia devs have developed patches for the current release LLVM has moved to the next one and won’t backport them (which I would say is a problem with their development process)

That’s not accurate. LLVM maintains branches of previous major versions, for backwards compatibility. Most distros have packages such as llvm_9 and llvm_8 which target these LLVM branches. Julia’s maintainers should ask upstream to merge their patches into whatever branch they’d like to use. Keeping patches for multiple major versions of LLVM is the worst thing one can do. Rust deals with LLVM ideally (see e.g.): they work in parallel on updating LLVM to the next major version, but keep using older major versions of it as needed.

Patches for older LLVM versions are probably kept just so that it’s still possible to test these versions e.g. to check performance regressions;

That’s what VCS is for.

you shouldn’t care about that for packaging

I do care, because I’d like to keep track of which upstream patches are still pending / open, which Julia’s maintainers don’t even do. Keeping old patches doesn’t make this easy at all.

I’ll summarize my criticism:

Don’t keep patches you don’t apply, that’s why there’s VCS.
If upstream accepts your patches, remove the patches and the makefile of that dep from your repo, and update the docs to say this external dep is safe to use from the system’s distro.
Don’t take the role of distro’s maintainers - keeping dependencies up to date is our job.
Bonus: Use a real build system, like cmake / meson (I’d recommend) / autotools.

yuyichao · August 12, 2020, 8:58am

Well, if it was that easy to get a patch accepted and backported it would have been done more often. It isn’t.

I don’t see how this can possibly be worse than simply dropping support for all but one LLVM version, which is what carrying patches for a single LLVM version means.

No it’s not. VCS is for keeping track of old versions, not for digging up patches for currently (at least partially) supported configurations. What’s in the repo is what currently works.

Now this is something that’s definitely not what VCS, and especially not Julia’s repo, is for. That’s the job of the upstream issue tracker. Afaict all LLVM patches are tracked upstream. The code in the repo is for building; if something makes building Julia for some configuration harder, it’ll never be done in order to fulfill the role of another project’s issue tracker.

All patches in the repos should be applied. Some not for default configuration. Removing all but a single configuration will not be the goal. There can certainly be ones that got missed in clean up but I don’t think that’s what you are complaining about.

This is exactly what we do. It may not appear so to you because LLVM is literally introducing bugs faster than we can fix them. Almost all recent LLVM versions introduce new bugs that need to be fixed. Some may not be fixable without a breaking change (or the upstreamed fix may not be backportable). If you want to push LLVM for backporting, you are certainly more than welcome to do so. As is, it’s already hard enough to avoid regressions and I generally find LLVM to care much less about bug reports and patches from smaller users like us. I have had much better experiences with gcc to be honest.

Figuring out which version, if any, of a dependency works is the developers’ job. Carrying patches is a part of it. Part of “keeping dependencies up to date” is submitting patches to either upstream or downstream. There are quite a few distro maintainers who have done that and it’s definitely welcome. I don’t think any are for LLVM though. Also, for Windows support, it is inevitable for developers to take the role of distro maintainer.

To summarize, almost all patches are submitted upstream, when there is an upstream. It is not always easy to get them upstreamed or backported though, which is why we still need to carry patches for supported versions. Although officially only one version may be supported, a few more versions are generally partially supported and this should not be a problem. The repo is never the right place to figure out the minimum code needed for a single configuration. The developers’ main job is to figure out what works. We have limited influence on upstream and downstream, and getting the patch to all the ideal places is as much a job for us as it is for upstream project maintainers and downstream packagers. Contribution on this front is very welcome.

doronbehar · August 12, 2020, 9:55am

I’m sorry to hear that (any links to examples of such interactions?).

So you don’t support using upstream’s llvm, but you do support different versions of LLVM, only if your patches are applied to whatever version is requested? I’m not necessarily saying this is a wasted effort, but if someone is obliged to use your llvm, what might make them choose a certain LLVM version over another?

It still would have been a bit nice to see the details of the state of your encounters with upstream somewhere in the repo, because even digging Julia’s old PRs and issues related to such upstream patches doesn’t help.

I understand now the state of LLVM is complex. Regarding p7zip, I’ll note that our distro uses GitHub - p7zip-project/p7zip: A new p7zip fork with additional codecs and improvements (forked from https://sourceforge.net/projects/sevenzip/ AND https://sourceforge.net/projects/p7zip/). which is a fork no other distro seems to be using. This fork is supposed to be safer / not as broken as the original p7zip from sourceforge.

Again, if you had only written somewhere where the upstream patches are submitted, I’d have been happy to nag upstream if needed, testing your patches and reporting in their mailing lists or GitHub PRs in order to get their attention. I’d love to contribute as much as I can, as a downstream packager. This is the real purpose of my request and of this thread: I didn’t want to bash you for not doing a good enough job, I only ask you to help me help you :).

yuyichao · August 12, 2020, 1:27pm

Well, mostly from how long it takes for them to respond to bug reports and the fact that there are a few patches that get little response from LLVM. When I submit bug reports in a similar area for LLVM and gcc, the gcc one usually gets a reply, and maybe a fix, much faster. This happened just this past weekend for me at 47058 – musttail emits wrong code for byval and 96539 – Unnecessary no-op copy with Os and tail call with struct argument (ref Workaround LLVM musttail bug by yuyichao · Pull Request #36981 · JuliaLang/julia · GitHub). While I can’t say how hard the LLVM bug is to fix, there’s so far zero reply even though I’ve certainly provided more info in the LLVM one.

Partially, mostly for the purpose of making sure our use of LLVM is sane, and you may be on your own to fix issues in a non-default LLVM version. There is only one officially supported version, currently 9.

That I agree with. And it’s generally possible for most LLVM patches, since their names usually contain the upstream ID and LLVM uses a searchable web interface rather than a mailing list.
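As a rough illustration of the point about patch names, here is a minimal Python sketch that pulls an upstream ID out of a patch file name and turns it into a review link. It assumes the ID is a Phabricator differential number of the form `D12345` and that reviews live under `https://reviews.llvm.org/`; the file names and the `upstream_review_url` helper are hypothetical, not taken from Julia’s repo.

```python
import re

# Assumption: an upstream ID is "D" followed by digits, not glued to other
# letters or digits (so "D12345" matches, "MD5" or "ID12" does not).
_DIFF_ID = re.compile(r"(?<![A-Za-z0-9])D(\d+)(?![0-9])")


def upstream_review_url(patch_name):
    """Return a reviews.llvm.org URL for a patch file name, or None."""
    match = _DIFF_ID.search(patch_name)
    if match is None:
        return None
    return f"https://reviews.llvm.org/D{match.group(1)}"


# Hypothetical file names, for illustration only.
for name in ["llvm-D12345-example-fix.patch", "llvm-local-workaround.patch"]:
    print(name, "->", upstream_review_url(name))
```

A patch whose name carries no such ID comes back as `None`, which is exactly the set a packager would need to ask about.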

Sure, that’s fine. I’m not saying that we are doing a perfect job tracking these. I’m only saying that some of your suggested solutions are not possible given our requirements (e.g. removing patches for different versions of LLVM) and some are not practical given our resources (e.g. pushing as hard as, say, Rust does). You are welcome to ask on GitHub (better than here since there’s more context and it is more targeted) about the upstream status, or even submit PRs that add links from the applied patches to the upstream URL. I can tell you that for most of the LLVM ones (especially ones that apply to Linux), although not necessarily for patches to other deps.

nalimilan · August 12, 2020, 1:33pm

Generally .patch files do contain links to upstream as I said. For example, the first p7zip patch has been taken from here, but apparently it hasn’t been included in any release yet despite fixing a CVE. That patch isn’t really required by Julia actually (it’s just more responsible for Julia developers to ship these for users’ security), so in Fedora I just use the system package and rely on p7zip Fedora maintainers to take care of patching these security issues. I haven’t looked for details, but these patches may have been taken from Debian (as visible on this one).

The gmp patch you spotted is an exception that we should fix and your help is certainly welcome.

Maybe the docs could be improved to distinguish between dependencies which need Julia-specific patches and dependencies that just happen to be patched to fix bugs that do affect Julia more than other software that use them. But that distinction can be blurry, as a bug may be innocuous for most users and yet create serious issues for others – and if Julia includes a test for the bug then you have to include the patch for them to pass.

StefanKarpinski · August 12, 2020, 3:22pm

I don’t really care for the attitude here, but I’ll take a crack at addressing some of your points anyway. A significant number of Julia maintainers are committers on both LLVM and libuv and we constantly upstream patches to those projects and elsewhere. However, upstreaming patches is a HUGE amount of work and often takes a lot of time and effort. If you’re bothered by this, how about helping out with the work of upstreaming patches and more significantly, convincing upstream projects to accept patches?

The same version of Julia is regularly compiled with multiple different versions of LLVM for various reasons. Notably, it is common for Julia developers to start working on support for newer LLVM versions before they are released so that the project is ready for them when they are. There are also people who need to compile Julia with older versions of LLVM for various reasons. Any given Julia version will work with multiple different versions of LLVM. The patches are kept around and applied selectively so that Julia can be source compatible with different LLVM versions without too much futzing around.
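To make “applied selectively” concrete, here is a minimal Python sketch of the idea: one table maps each supported LLVM major version to the patches that apply to it, and the build picks the list for the requested version. This is only an illustration, not how Julia’s Makefiles are written; the table contents, file names, and the `patches_for` helper are all hypothetical.

```python
# Hypothetical table: which patch files apply to which LLVM major version.
# A fix already merged upstream simply stops appearing for newer versions.
PATCHES_BY_LLVM_MAJOR = {
    8: ["example-fix-a.patch"],
    9: ["example-fix-a.patch", "example-fix-b.patch"],
    10: ["example-fix-b.patch"],
}


def patches_for(llvm_version):
    """Return the patch list for a version string such as "9.0.1"."""
    major = int(llvm_version.split(".")[0])
    try:
        return list(PATCHES_BY_LLVM_MAJOR[major])
    except KeyError:
        raise ValueError(f"no patch set recorded for LLVM {major}") from None


print(patches_for("9.0.1"))
```

Keeping every version’s list in the tree is what lets the same source build against several LLVM releases; deleting all but one list would drop the others.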

If upstream accepts your patches, remove the patches and the makefile of that dep from your repo, and update the docs to say this external dep is safe to use from the system’s distro.

Sure, that’s a good idea. Want to help keep all this info up to date?

Don’t take the role of distro’s maintainers - keeping dependencies up to date is our job.

Our job is shipping our users working versions of Julia and its packages. Distros have, frankly, utterly failed at shipping people working versions of Julia, so we do what we have to in order to get people software that just works with minimal fuss for them. While Linux is somewhat popular in numerical computing, from our perspective, distros are pretty niche: the vast majority of users are on Windows and macOS where distros aren’t a thing. While distros may be your focus, they are not ours. Most of our users do not use or care about distros. Even the ones who do use distros have not been able to rely on their distros to get a working Julia. If that changes, maybe we’ll care more about distros.

To distros who are still shipping broken Julia, I say this: get your shit together, do what YOU need to do to ship working software, and stop telling us how to do our jobs. We are shipping non-broken software that just works to more people than you are. If Julia needs a bespoke LLVM build, ship it with a frigging bespoke LLVM build. Stop forcing broken software on your users for some stupid policy reason.

Bonus: Use a real build system, like cmake / meson (I’d recommend) / autotools.

Riiiiiiight.

simonbyrne · August 12, 2020, 4:06pm

It probably would be a good policy to require each patch to have a header with a link to the bug and upstream issue so we can keep track of them more easily. At the moment you have to sort of git blame your way through it to find the rationale.
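A policy like this could be checked mechanically. The following Python sketch, under the assumption that “header” means everything before the first `diff ` or `--- ` line of a patch, lists the patch files whose header contains no URL. The helper names and the demo files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

URL = re.compile(r"https?://\S+")


def patch_header(text):
    """Return the lines that precede the first diff or file-header line."""
    header = []
    for line in text.splitlines():
        if line.startswith(("diff ", "--- ")):
            break
        header.append(line)
    return header


def patches_without_link(patch_dir):
    """List *.patch files in patch_dir whose header contains no URL."""
    missing = []
    for patch in sorted(Path(patch_dir).glob("*.patch")):
        text = patch.read_text(encoding="utf-8", errors="replace")
        if not any(URL.search(line) for line in patch_header(text)):
            missing.append(patch.name)
    return missing


# Demo on two throwaway patch files (contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "with-link.patch").write_text(
        "Upstream: https://example.org/bug/1\n--- a/f.c\n+++ b/f.c\n",
        encoding="utf-8",
    )
    Path(tmp, "no-link.patch").write_text(
        "Fix a crash\n--- a/g.c\n+++ b/g.c\n", encoding="utf-8"
    )
    missing = patches_without_link(tmp)

print(missing)  # prints ['no-link.patch']
```

Run against a real patches directory, the output would be the list of patches that still need a link added.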

nalimilan · August 13, 2020, 8:27am

Having seen both sides of the picture (as a Fedora package maintainer and as a Julia contributor), I must stress that Julia is particularly hard to package for distros. This is mostly due to the fact that a language is a significantly more complex piece of software than the average app and that it prompts improvements all over the ecosystem (e.g. with the suffixed ILP64 BLAS which also requires special versions of SuiteSparse and Arpack), at a fast rate. It also tends to exercise gcc so much that it triggers bugs when compiled with distro gcc flags.

But we could probably be more distro-friendly by documenting more clearly what packagers are supposed to do, which dependencies absolutely need Julia-specific patches and which are less important. docs/build/distributing.md would be a good place to put this. That file could also be made easier to find.

garrison · August 14, 2020, 2:16pm

I am interested in NixOS for the same reason Stefan cited Nix as an inspiration behind Pkg.jl: reproducibility. Far too often, when something is broken with one’s operating system, the only real solution is to reinstall from scratch. macOS and Windows have not solved this problem in a meaningful way, to my knowledge. I am interested in NixOS, not because I have an inherent interest in distros, but because it provides a working demonstration of a future with more reliable software/upgrades.

I think it is pretty clear that if NixOS wants to ship a working version of julia, it ought to include the fork/patches of both LLVM and libuv. If this is unacceptable on the NixOS side, I would recommend that people just use the official julia binaries, rather than a broken package. Do the official binaries work out of the box on NixOS? If not, is there a simple, documentable way to use buildFHSUserEnv to get them working? This combined with Pkg.jl will get you most of the way to reproducibility.

RCHG · November 1, 2020, 10:00am

There is also another discussion thread about NixOS and Julia here. Although its approach differs from the questions and answers here, it might also have some information for NixOS users interested in Julia.
