Blog post about my experiences with Julia

I found a solution to the infamous error message that appears on Arch Linux when running Julia from the official repo.

ERROR: LoadError: InitError: could not load library "/home/volker/.julia/artifacts/cb7fc2801ca0133a5bdea4bc4585d07c08284cfa/lib/libsundials_sunlinsollapackband.so"
libopenblas64_.so: cannot open shared object file: No such file or directory

The reason is that libopenblas64_.so is packaged into /usr/lib/julia on Arch Linux, but that path is not in LD_LIBRARY_PATH.
So Arch Linux users should add the following line to ~/.julia/config/startup.jl.

ENV["LD_LIBRARY_PATH"]="/usr/lib:/usr/lib/julia"
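A slightly more defensive variant of that line appends the directory instead of overwriting whatever is already set. This is only a sketch: the helper name `append_libdirs` is made up for illustration, and `/usr/lib/julia` is the Arch-specific directory mentioned above.

```julia
# Hypothetical helper (not part of Julia or any package): append directories
# to a colon-separated search path without duplicating existing entries.
function append_libdirs(current::AbstractString, extra::AbstractVector{<:AbstractString})
    dirs = isempty(current) ? String[] : String.(split(current, ':'))
    for d in extra
        d in dirs || push!(dirs, d)
    end
    return join(dirs, ':')
end

# In ~/.julia/config/startup.jl one could then write:
ENV["LD_LIBRARY_PATH"] = append_libdirs(get(ENV, "LD_LIBRARY_PATH", ""), ["/usr/lib/julia"])
```

Appending keeps any entries that were already exported in the shell, which the plain assignment would discard.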

But actually this should be done in the JLLWrappers package; see “Fix `libopenblas64_.so` not found on archlinux” by sukanka · Pull Request #42 · JuliaPackaging/JLLWrappers.jl · GitHub

I know

https://github.com/SciML/Sundials.jl/issues/334

Hardcoding paths that are necessary only on a distribution is not a solution, it’s an indication that said distribution is not doing something right.

You are right. So I now set ENV["LD_LIBRARY_PATH"] in ~/.julia/config/startup.jl.

At first I expected this to be like zverovich’s “Giving up on Julia” post, but it turned out to be closer to viralinstruction’s fantastic “What’s bad about Julia”, and I bet this thread’s suggestions and edits had a lot to do with it. I do have a couple of edits to suggest in the Performance section, though I expect I came a bit late for it.

I think it would be important to use the more established term “compilation/compiler latency” so people can look up more about this fascinating topic; also maybe throw in the informal “time to first plot” because of its common usage, though personally I find that to be misleadingly focused on plotting.
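For readers who have not met the term, compilation latency is easy to observe in a fresh session by timing the same call twice. The function name `sumsq` below is made up for illustration, and the actual timings depend on the machine and Julia version.

```julia
# A fresh function: nothing has been compiled for it yet.
sumsq(xs) = sum(x -> x^2, xs)

data = rand(1_000)

# The first call includes JIT compilation for this argument type.
t_first = @elapsed sumsq(data)

# The second call reuses the already compiled code.
t_second = @elapsed sumsq(data)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The gap between the two numbers is the latency being discussed; “time to first plot” is the same effect measured on a plotting call.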

It is also not correct to say that interpreted languages do not compile; for example, CPython compiles source to bytecode, and though the compilation does far less work and runs really fast, it actually does do a little constant folding. It would be more accurate to say that interpreted languages “solve” compilation latency by doing much less compilation at the cost of performance optimization. Going further, most compiled languages deal with compilation latency by providing different levels of optimization and by saving compilation.

To Julia’s credit, compilation-saving does happen out of the box to a significant degree; the blog post itself notes that “precompilation” cuts down `using Plots; using DifferentialEquations` from 350s on the first load to 20s on subsequent loads. Compiler latency is also being improved from several different angles. The Julia blog posts on reducing method invalidations are quite interesting, and PackageCompiler’s sysimages are another way to save compilation.

I understand how editing structs fits into the compilation latency and REPL restart, but it’s a very difficult problem (see GitHub issue #18 for Revise.jl) that does deserve its own paragraph. Revise.jl uses backedges to track methods that get invalidated (need recompilation) because some method is edited, and this has some overhead. Type constructors, e.g. Int(x), are called a LOT more than methods, so tracking structs the exact same way is just infeasible bloat. This also means that if a struct could be edited, it would invalidate a lot more, possibly approaching the effect of a full restart. Scratch all that, maybe just mention that Revise.jl is able to update methods but not structs via source code edits (I see that the use of includet to track source files was already clarified upthread).

This is not the reason why redefining structs is hard. Julia uses the layout of a struct in the generated code and to optimize array allocation. We can redefine code, but can’t redefine already allocated data :slight_smile:

So in essence a fully qualified MyModule.MyType can’t be changed, but you can replace the module and the contained type.
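A small sketch of both points, using made-up names (`Shapes`, `Circle`). It assumes a plain script or REPL session, is not taken from the thread, and the exact warning text printed when the module is replaced depends on the Julia version.

```julia
# Made-up example module; the names are illustrative only.
module Shapes
struct Circle
    x::Float64
    y::Float64
    r::Float64
end
end

# The layout is fixed by the definition: three Float64 fields, 24 bytes,
# and a Vector{Circle} stores its elements inline using that size.
@show sizeof(Shapes.Circle)
@show fieldoffset(Shapes.Circle, 3)
@show sizeof(Vector{Shapes.Circle}(undef, 4))

old = Shapes.Circle(0.0, 0.0, 1.0)

# Replacing the whole module (Julia warns about this) creates a new,
# unrelated type that happens to have the same name.
module Shapes
struct Circle
    x::Float64
    y::Float64
    r::Float64
    label::Symbol
end
end

new = Shapes.Circle(0.0, 0.0, 1.0, :unit)

# The already allocated instance still has the old type and layout.
@show typeof(old) === typeof(new)
@show old isa Shapes.Circle
```

Any code compiled against the old `Circle` keeps assuming the 24-byte layout, which is why the existing instance cannot simply be reinterpreted as the new type.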

Oh I see, I admit my understanding of the struct editing situation is spotty at best, and I was recalling this thread in particular. I take back my suggestion to edit the blog post to explain the difficulty of struct editing, though it would be worth adding a sentence to mention Revise.jl and the fact it handles editing of methods but not structs.

Replacing the module is also mentioned in Revise.jl’s issue #18 as problematic. It’s not particular to structs: preexisting code that uses the old module will still use the old module even if its name is attached to a new module, and if you attempt `using` on the new module via the same name, you’ll get identifier conflict warnings and “both ModuleName and ModuleName export …” errors. And as you mentioned, there’s the problem of the obsolete types’ persistent instances (a problem that also exists in the more dynamic interpreted languages). But I do wonder why there aren’t module backedges+invalidations+reruns as a way to deal with edited structs; it just seems like an obvious development from the discussion in issue #18, and rerunning affected modules seems better than a full restart.

But this is starting to be a bit removed from the main topic, so if anybody wants to clarify this stuff for laypeople like me, feel free to split this to another topic.

you should just use juliaup :wink:

thanks, but i use this trick because i don’t want to use julia from aur.
Currently my julia just runs fine.

juliaup just installs and manages the official binaries (from the official Julia page). But of course it’s just a suggestion. You should choose what and where to run as you please :wink:

Does there exist a writeup/reference for what you consider the “right workflow”? I’d be interested to learn more.

Docstrings for the requested function are soon to be merged:

https://github.com/SciML/SciMLBase.jl/pull/157

Startup times for DiffEq will be greatly improved after ArrayInterface require usage is fully removed, which just has one remaining piece:

https://github.com/JuliaArrays/ArrayInterface.jl/pull/266

And now that we have a whole system for high level errors at the pre-solve stage, I added a few more. For example:

https://github.com/SciML/DiffEqBase.jl/pull/752

https://github.com/SciML/DiffEqBase.jl/pull/753

More should be coming soon as well.

Like @patrick, I’d also be interested in any references or documents that explain this workflow (I’m starting with Julia here, and concerned about making some of these avoidable mistakes).

@rdiaz02 and @patrick

I didn’t answer before because I don’t really have that carefully prepared.

I do have some short notes here:

https://m3g.github.io/JuliaNotes.jl/stable/modules/

and here:

https://m3g.github.io/JuliaNotes.jl/stable/workflow/

But reading such notes (and, IMHO, reading these instructions in general) doesn’t give the user the correct idea. A small demonstration is usually much better. A short video illustrating the use of Revise is also available.

But, basically, what one needs is to keep the REPL open, but develop the code mostly in files, which are tracked by Revise. Put the code in functions, even when one is just plotting something, like:

# file: myplot.jl
using Plots
function myplot(data)
    plot(data, linewidth=2, color=:blue, label="test")
end

Because then you can, in the REPL, use something like:

julia> using Revise # normally loaded by default from your startup.jl file

julia> using DelimitedFiles # provides readdlm

julia> includet("./myplot.jl")

julia> data = readdlm("mydata.dat")

julia> myplot(data) # this gets you the first plot, and may take some seconds

# change something in the `myplot` function above

julia> myplot(data) # this is fast now, and tuning the plot is easy

# change the data

julia> data2 = readdlm("./mynewdata.dat")

julia> myplot(data2)

#etc.

With that, the responsiveness of the development is really good, better than other alternatives, because you no longer need to worry about compiling anything, rerunning the script, etc. Just modify the function (myplot in this case) in the corresponding file, and run it again in the REPL.

If you are developing a package, the same thing holds, but the includet is replaced by using your package, which is installed as a dev package in the environment.
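A rough sketch of the package variant, using only the Pkg standard library. The package name `MyDemoPkg` and the temporary locations are made up for illustration, and the snippet assumes a normally configured Julia installation.

```julia
using Pkg

# Work in a throwaway environment so nothing here touches your real projects.
Pkg.activate(; temp=true)

# Generate a skeleton package in a temporary directory (the name is made up).
pkgdir = joinpath(mktempdir(), "MyDemoPkg")
Pkg.generate(pkgdir)

# Install it as a dev package: the environment points at the source directory.
Pkg.develop(path=pkgdir)

# In an interactive session you would run `using Revise` before this line,
# so that later edits under src/ are picked up without restarting.
using MyDemoPkg
```

Because the environment refers to the source directory rather than a copy, editing files under `src/` changes what the package loads.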

Leandro, thank you very much for your detailed explanation, and the links to your JuliaNotes (which also contain a lot of other very helpful material) and the video. This is very useful.

Thank you! This is extremely helpful.

Is there a good reason to use Revise instead of Pluto notebooks for smaller projects (e.g. one or two day experiments)? They seem to have the same advantage of not requiring recompiling.

“Good” is subjective; I don’t do that because I think Pluto takes too long to load. I use it for producing teaching material.

The main advantage of Revise is automatically tracking changes to code in files. Pluto doesn’t do that. And Revise and Pluto can be used together. When you have some code in a file that you are changing, and you use that code in a Pluto notebook, Revise will continuously track the changes so that the next time you run the cells that use it, the changed code will run.

On the issue of plots in the docs, I want to put in a vote for UnicodePlots.jl.

It is a small dependency, it is clearly not meant to be the only plotting package you should use, and does the job of demonstrating a function like ‘sin’ well.

It almost does not feel like UnicodePlots is interested in competing for the crown of the #1 plotting package, so it sits outside that discussion.

Let me start by disclosing two facts:

  1. I have not read this entire thread
  2. I only skimmed your blog post

That being said, you mention a lot of positives about Rust and a lot of negatives about Julia. I’m employed as a Data Scientist/Statistician and about a year and a half ago I started learning Rust and have built several useful tools with it. I think Rust is a fantastic language but I find Julia to be the better choice for 95% of the work that I do. There are many things I could do with Rust, but I can do them faster/easier with Julia. For example, I really can’t imagine exploring an unfamiliar dataset that’s 20 GB in size with Rust. It’s just so much easier and faster for me to do that with Julia. Another example might be exploring the predictive performance of half a dozen different modeling techniques - there’s no way I’m doing that with Rust instead of Julia.

I love Rust for building command line tools, ETL, and really for any repetitive tasks that I will need to perform over the medium- and long-term (and that I won’t need to make many changes to). It’s great for that kind of work and it really is a beautiful language that excels at certain tasks. If you are doing more traditional systems programming, Rust is the better option, hands down. If you are doing data science/stats work, I don’t think Rust even comes close to Julia.
