Blog post about my experiences with Julia

Hi Carsten

I am not in Germany (I live in an English speaking country though).

We are in the process of getting a new HPC system, but I am not allowed to talk about any details during the post-tender process (it is not military classified, though).

Plus configuring it correctly w.r.t. MPI, SLURM, CUDA, parallel I/O, etc., to make sure Julia can actually take full advantage of the hardware and (low-level) software. I’m always amazed to see the amount of complexity you need to go through as a user to get, say, GPU-to-GPU transfers working at full speed. I don’t work on those things myself, but my colleagues often have to guide users in compiling and running their programs efficiently.

That’s why I think just downloading Julia on a supercomputer and running it will not immediately give you all the performance you want, especially when running large multi-node jobs.

@carstenbauer Hi! I’m co-organizing the Julia for HPC webinar at SURF, you spoke to Abel :slight_smile:

3 Likes

Why? Keep in mind that Julia is a compiler: the performance of the code it generates doesn’t depend on how Julia itself was compiled. There can be a little improvement in the performance of Julia’s runtime, but the benchmarks in Compiling Julia using LTO+PGO - #5 by stabbles showed negligible speedup when compiling Julia and all of its dependencies with -march=native. You mention large multi-node jobs, but if you’re referring to MPI, the package MPI.jl does dynamic loading, so compiling Julia locally doesn’t have any advantages compared to using prebuilt generic binaries of Julia. Or were you thinking of something else?

No, I did not mean it is necessary to compile Julia locally. One issue is that, at least on our system, there’s quite a set of environment combinations that a user can choose from: the MPI implementation (Intel versus OpenMPI), HDF5 (tied to the MPI version), srun versus mpiexec, etc. E.g. even OpenMPI comes in 3 flavors, compiled with GCC, the Intel compilers and the NVIDIA HPC compilers. So depending on their needs (and sometimes restrictions in the software used forcing certain choices), users will load certain versions of these. I’m still not entirely sure Julia will pick up the correct shared libraries in all cases and work as expected, but I haven’t looked closely at that yet.

For example, looking at Configuration · MPI.jl it seems you do need to build MPI.jl with the correct module loaded, but this then means a user will need to make sure the same module is loaded when actually running a Julia program under that MPI. But again, I haven’t actually gone through these steps myself yet, partly because it takes quite a bit of time to figure out which libraries Julia uses from the system, which from jll’s it downloads itself, if these conflict with system libraries, if all combinations work, etc.

Edit: in short, correct configuration of the environment is part of the challenge :wink:

3 Likes

As for the structure of the documentation, I am absolutely not implying that Julia should mimic Matlab (after all, the reason for my migrating to Julia is that it does things in its own way), but for completeness I would like to share here that the majority of their toolboxes have these four pieces in the documentation section:

  • Release notes
  • Getting started
  • User’s guide
  • Reference

I am confident that a better system can be designed and implemented, but this one works fairly well, I think.

Oh, ok, configuration is certainly an important component! Note that the documentation of the development version of MPI.jl also has notes for HPC cluster sysadmins.

BTW, we have a Julia HPC working group meeting every fourth Tuesday of the month, and how to simplify configuration is a recurring topic. It’d be great if you could join us to bring your experience or share your thoughts! See also JuliaHPC Meeting (we don’t use Google Meet anymore, but the document with the agenda should have the link for joining the call).

2 Likes

Does this course delve deeply into the HPC side of things, and is it possible to participate in it online even if you are not in Stuttgart?

No. This is an in-person workshop which, given that it’s the first event at HLRS, is primarily intended for people with basic HPC knowledge (in any language) who are not particularly familiar with Julia for HPC. There’ll be a website announcement soon.

I don’t think confidence in the language should be tied to the choice of not including a particular package in the official docs. Rust has basically the same strategy when it comes to packages: it doesn’t include fundamental packages in the standard library. Not even rand is in the standard library, nor tokio, which is used for async stuff. And let me tell you, Rust is very far from being a language people have little faith in, since it is gradually being used everywhere. And besides, I’ve also had a similar experience with Rust, having to dig through the source code to understand how libraries work. It’s the nature of new, fast-growing languages. I’m not saying “deal with it”, but it’s nothing unheard of.

2 Likes

Julia and Rust are targeting very different audiences. Julia in many ways caters to scientific and numerical computing, where many users are not professional developers.

What may need to happen is the creation of Julia distributions in the same sense that there are Linux distributions. These distributions could include a wider range of standard packages as well as custom system images with those packages compiled in. A distribution could include Plots.jl and GR.jl built-in for example and be optimized to minimize time to first plot.

4 Likes

Fortunately it’s easy to do this yourself

1 Like

I found a solution to the infamous error message on Arch Linux when running Julia from the official repo.

ERROR: LoadError: InitError: could not load library "/home/volker/.julia/artifacts/cb7fc2801ca0133a5bdea4bc4585d07c08284cfa/lib/libsundials_sunlinsollapackband.so"
libopenblas64_.so: cannot open shared object file: No such file or directory

The reason is that libopenblas64_.so is packaged in /usr/lib/julia on Arch Linux; however, that path is not in LD_LIBRARY_PATH.
So archlinux users should add a line to ~/.julia/config/startup.jl.

ENV["LD_LIBRARY_PATH"]="/usr/lib:/usr/lib/julia"

But actually this should be done in the JLLWrappers package, see Fix `libopenblas64_.so` not found on archlinux by sukanka · Pull Request #42 · JuliaPackaging/JLLWrappers.jl · GitHub

4 Likes

I know

Hardcoding paths that are necessary only on one distribution is not a solution; it’s an indication that said distribution is not doing something right.

3 Likes

You are right. So I now set ENV["LD_LIBRARY_PATH"] in ~/.julia/config/startup.jl.

At first I expected this was going to be like zverovich’s “Giving up on Julia” post, but it turned out to be closer to viralinstruction’s fantastic “What’s bad about Julia”, and I bet this thread’s suggestions and edits had a lot to do with it. I do have a couple of edits to suggest in the Performance section, though I expect I came a bit late for it.

I think it would be important to use the more established term “compilation/compiler latency” so people can look up more about this fascinating topic; also maybe throw in the informal “time to first plot” because of its common usage, though personally I find that to be misleadingly focused on plotting.

It is also not correct to say that interpreted languages do not compile; for example, CPython compiles source to bytecode, and though the compilation does far less work and runs really fast, it actually does do a little constant folding. It would be more accurate to say that interpreted languages “solve” compilation latency by doing much less compilation at the cost of performance optimization. Going further, most compiled languages deal with compilation latency by providing different levels of optimization and by saving compilation.
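As a small illustration of the CPython point, here is a minimal Python sketch that uses the standard-library dis module to inspect what the bytecode compiler produced. The function name seconds_per_day is made up for this example, and the exact disassembly differs between CPython versions; the only thing relied on is that the folded constant shows up among the code object's constants.

```python
import dis


def seconds_per_day():
    # Written as arithmetic in the source; CPython's bytecode compiler
    # evaluates the constant expression ahead of time.
    return 60 * 60 * 24


# Constants stored in the compiled code object. The folded product is
# expected to appear here, so no multiplication is left for run time.
folded_constants = seconds_per_day.__code__.co_consts
print(folded_constants)

# Human-readable bytecode listing (format depends on the CPython version).
dis.dis(seconds_per_day)
```

So even an "interpreted" run starts with a quick compilation step; it just does far less optimization work than Julia's compiler does.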

To Julia’s credit, built-in compilation-saving does happen to a significant degree; the blog post itself notes that “precompilation” cuts down using Plots; using DifferentialEquations from 350s on the first load to 20s on subsequent loads. Compiler latency is also being improved from several different angles. The Julia blog posts on reducing method invalidations are quite interesting, and PackageCompiler’s sysimages are a way to save compilation.

I understand how editing structs fits into the compilation latency and REPL restart, but it’s a very difficult problem (see GitHub issue #18 for Revise.jl) that does deserve its own paragraph. Revise.jl uses backedges to track methods that get invalidated (need recompilation) because some method is edited, and this has some overhead. Type constructors, e.g. Int(x), are called a LOT more than methods, so tracking structs the exact same way is just infeasible bloat. This also means that if a struct could be edited, it would invalidate a lot more, possibly approaching the effect of a full restart. Scratch all that, maybe just mention Revise.jl being able to update methods but not structs via source code edits (I see that the use of includet to track source files was already clarified upthread).

4 Likes

This is not the reason why redefining structs is hard. Julia uses the layout of a struct in the generated code and to optimize array allocation. We can redefine code, but can’t redefine already allocated data :slight_smile:

So in essence a fully qualified MyModule.MyType can’t be changed, but you can replace the module and the contained type.

10 Likes

Oh I see, I admit my understanding of the struct editing situation is spotty at best, and I was recalling this thread in particular. I take back my suggestion to edit the blog post to explain the difficulty of struct editing, though it would be worth adding a sentence to mention Revise.jl and the fact it handles editing of methods but not structs.

Replacing the module is also mentioned in Revise.jl’s issue #18 as problematic. It’s not particular to structs: preexisting code that uses the old module will still use the old module even if its name is attached to a new module, and if you attempt to load the new module with using under the same name, you’ll get identifier conflict warnings and “both ModuleName and ModuleName export …” errors. And as you mentioned, there’s the problem of the obsolete types’ persistent instances (a problem that also exists in the more dynamic interpreted languages). But I do wonder why there aren’t module backedges+invalidations+reruns as a way to deal with edited structs; it just seems like an obvious development from the discussion in issue #18, and rerunning affected modules seems better than a full restart.
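To make the remark about persistent instances concrete, here is a minimal sketch in Python, taken as one example of a more dynamic interpreted language. The class name Point and the variable names are hypothetical, chosen only for illustration. Redefining the class rebinds the name to a new class object, while instances created earlier keep pointing at the old one.

```python
class Point:
    def __init__(self, x):
        self.x = x


old_instance = Point(1)
OldPoint = Point  # keep a handle on the original class object


class Point:  # "editing" the type: the name now refers to a brand-new class
    def __init__(self, x, y):
        self.x = x
        self.y = y


new_instance = Point(1, 2)

# The earlier instance still belongs to the original class ...
print(type(old_instance) is OldPoint)
# ... so it is not an instance of the redefined Point ...
print(isinstance(old_instance, Point))
# ... and it did not gain the new field.
print(hasattr(old_instance, "y"))
```

The language accepts the redefinition without complaint, but the stale data is still around, which is the same underlying problem described above for Julia.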

But this is starting to be a bit removed from the main topic, so if anybody wants to clarify this stuff for laypeople like me, feel free to split this to another topic.

1 Like

you should just use juliaup :wink:

Thanks, but I use this trick because I don’t want to use Julia from the AUR.
Currently my Julia runs just fine.