Blog post: Rust vs Julia in scientific computing

At a much lower level of complexity, it benefits my simple package ExpandNestedData. I’m able to traverse the nested data in a dynamic, type unstable initial loop, then compile custom iterator functions for each resulting table column. I get no allocs, and I can handle whatever weird datatypes the user needs. That might be possible in Rust, but it was so easy in Julia.

And it shows how dynamism and Julia specifically can benefit applications outside scientific computing.

6 Likes

Thanks for the post, @Mo8it

There has been some discussion on the post on the Julia Slack. I’ll summarize my view here. As a research software engineer who likes both Julia and Rust, I’m very interested in the topic (I also shared your post on Hacker News). Unsurprisingly, I reach a very different conclusion than you - I think Julia is far superior to Rust for science!

Let me first say that I agree with most of the concrete points you raise in the post. In particular, I agree with the following points:

  • Rust provides more and better static guarantees than Julia, especially guarantees about concurrent programming such as multithreading, and these guarantees prevent some bugs in real life.
  • Rust’s error handling is more robust in the sense that it leads programmers to write code with emphasis on the edge cases, again producing more robust code.
  • Rust’s model of traits and interfaces is far superior to Julia’s abstract inheritance (although I believe Julia could not and should not copy Rust here - Rust’s trait model would not be practical or desirable in Julia). Specifically, Julia’s lack of discoverable and enforceable interfaces is grating.
  • Rust tends to be faster than Julia because of an increased emphasis on zero-cost abstraction.
  • Some of Rust’s tooling (rust-analyzer, clippy, autogenerated docs) is far superior to Julia’s.
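
As a small illustration of the error-handling point in the list above (a sketch, not taken from the blog post; the function name `parse_row` and the input format are made up for this example): a fallible function says so in its return type, and the caller has to decide what to do with the failure case before it can reach the value.

```rust
use std::num::ParseFloatError;

/// Parses a comma-separated line of numbers (hypothetical helper).
/// The return type states that parsing can fail.
fn parse_row(line: &str) -> Result<Vec<f64>, ParseFloatError> {
    // Collecting into a `Result` stops at the first field that fails to parse.
    line.split(',')
        .map(|field| field.trim().parse::<f64>())
        .collect()
}

fn main() {
    // The caller cannot use the row without handling the `Err` case.
    match parse_row("1.0, 2.5, x") {
        Ok(row) => println!("parsed {:?}", row),
        Err(e) => println!("bad input: {}", e),
    }
}
```

Whether this style pays off depends on the project; in quick exploratory code the same explicitness is part of the boilerplate discussed further down.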

I would also add one more point in favor of Rust: It’s way easier to just download and execute a Rust program than it is to run a Julia program, because right now, there is no concept of an “app” or “executable package” in Julia, so you need to be fairly familiar with Julia to even run a Julia program (e.g. make a virtual environment yourself, install the package into that env, then make a script that calls into the package, then run the script in the right environment with --startup-file=no etc. etc).

However, I disagree with some major points in your post, and I think there are other important points that you have omitted when comparing Rust and Julia:

Most importantly, I think you wildly understate the importance of programmer productivity in scientific programming. It’s not a coincidence that Python (and previously, Perl) are dominant in the field and not C++, despite all the static guarantees C++ can provide. For scientific programming in particular, the ratio of code to available development hours is high, so it’s critical to have a language that allows you to write code quickly, and Rust simply does not do that. The language is slow and boilerplate-heavy to write.

Second, scientific computing needs a language where it’s fast to iterate on an idea and quickly pivot, changing an existing program on the fly. Julia excels at this, whereas Rust is particularly bad at it. In practise, even subtle changes to a type or to the ownership model in Rust tend to cascade through the entire program, requiring you to rewrite a large chunk of your code for trivial borrowchecker reasons. More generally, while the borrowchecker does prevent a whole class of bugs, in like 90% of cases, it just gets in your way pointlessly by preventing you from writing correct code the borrowchecker simply can’t reason about, and which would be solved much more easily and more elegantly with a garbage collector. Garbage collection does have some downsides - for scientific programming in particular that it can be slow in certain situations and it can hog memory - but in most cases, it simply solves the problem of memory safety in a single, elegant, fell swoop.
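
One commonly cited instance of "correct code the borrowchecker can’t reason about" can be sketched as follows. The example and the function name `swap_first_two` are illustrative and not from the thread: taking mutable references to two different elements of a slice is memory-safe, but the direct form is rejected because the compiler does not track that the indices are disjoint.

```rust
/// Swaps the first two elements of a slice (hypothetical helper).
/// Panics if the slice has fewer than two elements.
fn swap_first_two(v: &mut [i32]) {
    // Rejected at compile time, although v[0] and v[1] never overlap:
    // std::mem::swap(&mut v[0], &mut v[1]);

    // Accepted: split the slice into two non-overlapping mutable parts first.
    let (left, right) = v.split_at_mut(1);
    std::mem::swap(&mut left[0], &mut right[0]);
}

fn main() {
    let mut data = vec![1, 2, 3];
    swap_first_two(&mut data);
    println!("{:?}", data); // prints [2, 1, 3]
}
```

For this particular case the standard library also offers `v.swap(0, 1)`; the point is only that the obvious version needs a workaround.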

Third, I think you misunderstand what we mean by the “two language problem”. What we don’t mean is that Julia can take the place of every other language because it’s good at everything such that no other language is ever needed. What we DO mean is that Julia is both fast, convenient and flexible enough that you don’t need to prototype and iterate in one language, then implement the final solution in another language due to speed. Which, practically speaking, is what tends to happen. At least in my experience, people don’t rewrite Python to C++ because they want the static guarantees - they do it because Python is too slow.

Then there are a few smaller points

  • Rust doesn’t actually prevent race conditions. It prevents some of them, sometimes (i.e. you can’t have two threads mutating the same variable, but you can absolutely have deadlocks). In general, I find that Rustaceans tend to argue in absolute terms, saying stuff like “Rust is correct”, or “Rust prevents data races”. The true but nuanced “Rust is somewhat safer, and prevents some data races”, while more accurate, is just less sexy. Speaking of which - sure, Rust is faster than Julia because it places emphasis on zero-cost abstractions, but Zig is faster than Rust because it places even more emphasis. So, should you use Zig? Probably not - because both Julia and Rust are fast enough for nearly all practical purposes.
    EDIT: As you mention below, your blog post is careful to mention that Rust only prevents “data races”, which are indeed prevented in Rust, not race conditions in general. You are completely right.
  • In your JET example, you don’t use it the way that is recommended in its documentation and its README page: You need to check for dynamic dispatch with @report_opt before you use @report_call. If you use JET correctly, it will indeed warn against the problem in your example.
  • Interactivity is not just nice when doing stuff like plotting. It’s also useful when programming in general. For example, it’s way faster in Julia to try out different implementations of a single function, or test a single function in the REPL, or to benchmark or iterate on a small part of the program, because you can do it interactively. The number of times I’ve wished I had a REPL when debugging a Rust program…
  • You don’t mention that Julia tends to encourage code reuse, genericness and code sharing much more than Rust. I believe this is pretty important in scientific code. It’s way easier for me to break open someone else’s Julia code and reuse it than for Rust, because Rust programs tend to be tangled together more. Which again, is really useful for scientific code in general because we ought to reuse each other’s work more.

So in conclusion, while I agree with many of the concrete points, I myself believe that 1) you understate how Rust, by being extremely pedantic, makes writing code slow and ossifies the development process, while 2) also understating the advantages of Julia’s dynamicness and its static capabilities.

Rust is probably superior for writing fundamental command-line tools where you can afford to spend a lot of development time up front, and you really need the guarantee Rust provides, e.g. programs like minimap or samtools, but Julia is superior for most scientific programming.

Of course I hope that Julia’s static guarantees improve, and the many, many correctness footguns in Julia will be reduced over time :slight_smile:

86 Likes

I think you mean “doesn’t actually prevent race conditions”: as explained in the page you linked, Rust prevents the special case of race conditions called data races, but not other kinds such as deadlocks (and of course it allows data races through unsafe blocks).
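
To make the distinction concrete, here is a minimal sketch (the function name `parallel_count` is made up for illustration). Safe Rust will not compile unsynchronized mutation of a value shared between threads, so the counter has to go behind a `Mutex`; that is the data-race guarantee.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from `n_threads` threads,
/// `per_thread` times each (hypothetical helper).
fn parallel_count(n_threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Every access goes through the lock, so there is no data race.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000)); // prints 4000
}
```

What the compiler does not check is the logic around the locks: two threads that take two mutexes in opposite order compile just as happily and can deadlock, and a check-then-act sequence spread over two separate `lock()` calls can still interleave badly.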

6 Likes

Just wanted to add a few points to the post by @jakobnissen, which already contains most of what I would want to say…

People don’t decide to use C/C++ just for performance, but also for a better project scalability and elimination of many classes of bugs at compile time.

See:
https://arxiv.org/pdf/1901.10220.pdf&#:~:text=RQ1%20“Some%20languages%20have%20a,were%20associated%20with%20more%20bugs.

C/C++ are right up there with Python and a bunch of other dynamic languages for bugs per commit… From that statistic, it would be pretty weird to move from e.g. Python to C/C++ to “eliminate bugs”.
Besides that paper, this matches my experience with writing in static languages and I’ve seen lots of other articles/papers confirm this (I’ve written the most horrible bugs of my career certainly in C++).

This doesn’t defeat the argument, that static languages can be nice to avoid certain categories of bugs, but there are clearly way more factors to bug free code, than having a static checker which usually comes with big trade offs…

E.g. I’d much rather have HTTP.jl written in Rust, since there are so many corner cases that can happen at runtime, which are really fatal for serving a website, that it would be worth it to deal with Rust.
But e.g. writing a package like Makie.jl is pretty impossible in Rust, and I don’t think I would profit a lot from the static checker.

As mentioned above, well optimized Julia code can get close to the performance of Rust, but it will never reach its performance.

That’s pretty much like saying electric cars are faster than combustion cars. Of course there are situations where having no GC means you end up faster, but there are also lots of problems, that are more efficiently solved by having a GC or simply have the same performance ceiling.
Also, Julia is a language that allows you to opt out of the GC much better than most other GC languages, so it seems especially untrue for Julia. Back in the day, I did lots of benchmarking of Julia against C/C++, and it was almost always possible to reach the same performance as heavily optimized C code, so I’m pretty optimistic it’s the same for Rust in most situations.

On the other hand, if you want the highest performance in Julia, you have to write a for loop.

Julia also has no-cost abstractions as a selling point… I haven’t benchmarked Rust against Julia a lot, but I’d be surprised to see large differences between Julia’s iterators and Rust’s… Maybe you ran into a performance bug in some specific implementation, or was this just an assumption without ever benchmarking it?

24 Likes

Thank you for indirectly assuming that I am not aware of the Julia ecosystem. I actually am, but you did not read the whole blog post and did the cherry picking that you criticize :slight_smile: Because I have an appendix about JET.jl

It is nice that you can get a bit closer to the Rust version in Julia. But you are missing the point of having proper sum types as first class citizens in a language that tells you what a function exactly returns in its signature.

I think that although Any provides flexibility in dynamic languages, collections of Any are a big mistake in a language focused on performance.

Rust has enums which are sum types that can provide you with that flexibility:

enum Color {
    Gray(u8),
    RGB(u8, u8, u8),
    RGBA(u8, u8, u8, u8),
    Named(String),
}

// `colors` has the type Vec<Color>
let colors = vec![
    Color::Gray(42),
    Color::RGB(10, 0, 20),
    Color::RGBA(0, 0, 255, 100),
    Color::Named(String::from("red")),
];

There are also trait objects which you can use if you only care about the structs implementing some behavior (trait).
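
A minimal sketch of the trait-object alternative, for comparison with the enum above. The trait `Describe` and the structs `Gray` and `Named` are invented for this example; they are separate types here rather than variants of one enum, and the only thing the collection knows about them is that they implement the trait.

```rust
/// Behavior shared by otherwise unrelated types (hypothetical trait).
trait Describe {
    fn describe(&self) -> String;
}

struct Gray(u8);
struct Named(String);

impl Describe for Gray {
    fn describe(&self) -> String {
        format!("gray({})", self.0)
    }
}

impl Describe for Named {
    fn describe(&self) -> String {
        format!("named({})", self.0)
    }
}

/// Calls `describe` on each element through dynamic dispatch.
fn describe_all(items: &[Box<dyn Describe>]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}

fn main() {
    // A heterogeneous collection: each element is a boxed trait object.
    let items: Vec<Box<dyn Describe>> = vec![
        Box::new(Gray(42)),
        Box::new(Named(String::from("red"))),
    ];
    for line in describe_all(&items) {
        println!("{}", line);
    }
}
```

The trade-off compared with an enum is roughly this: the set of types stays open for extension, but each call goes through a vtable and each element lives behind a pointer.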

This is what I teach in my Julia courses, but it still happens that a vector does not end up with the concrete type that you expect, and Julia will not even warn you about it. This is what I meant by "Have fun profiling, using the macro @code_warntype while interactively calling a function, etc."

I think that Julia should have taken the approach of Rust of deriving the type of the vector without even having to specify it like in the following example:

// The type of the vector is not known yet, but it will be derived.
let mut v = Vec::new();

// We push a float with 64 bits, therefore v has the type Vec<f64>.
v.push(1.0);

// The type will not change!
// The line below will give a compile error because we are pushing an integer into a vector of floats.
// v.push(1);

You can specify variable types in Rust, but you don’t have to in most cases.

Then don’t use collections of Any when writing Julia for performance-oriented applications? I don’t really follow how the mere existence of a slow option is a point against a language when fast options are just a few keystrokes away (and are often the default)

I think that Julia should have taken the approach of Rust of deriving the type of the vector without even having to specify it like in the following example:

You can specify variable types in Rust, but you don’t have to in most cases.

Julia already does this, except for non-const globals. Maybe you are using global variables, which is the subject of the very first performance tip.

5 Likes

I don’t think there is any need to get personal here - of course, I read the whole article. The fact that you did mention JET and failed to recognize that it undermines some of the comparisons you were making is not fair.

In a way, the mere fact of mentioning JET and ignoring and/or misusing it is worse than not mentioning it at all. But hey, you admitted you are biased from the beginning - so what more is there to say?

Also, it is not fair to misuse JET in making an argument against Julia / pro Rust:

You don’t need to sell me Rust - it is in my nature to love Rust - I used F# for 8+ years before starting coding in Julia, and I know well the pains of giving up the type-safety (at least the kind of type-safety that Haskell, F#, and Rust can offer).

You can write a very good set of arguments on why Rust eliminates certain Julia pain points for the developer. And everybody can agree on that - but remember your resolution/proposition: scientific computing + the two-language problem.

Now, if that is the subject, then your arguments need to be evaluated in the context of your proposition.

And it is my opinion that the arguments you picked (while they can favor Rust as a good/nice language - a resolution which is accepted by most Julia developers, at least that is my impression from reading related topics) have nothing to do with showing that Rust is a better pick for scientific computing than Julia. And it is pointless to mention that for certain problems, a language can be a better pick than another - that doesn’t bring any new information to the discussion (it has been a known/accepted thing for ages).

And more important: showing that Rust is great is not a statement about why Julia didn’t solve the two-language problem. You need to discuss Julia - and show how people are prototyping in Julia and then switching to implement the solution in C/C++/Rust to make it fast :stuck_out_tongue_closed_eyes:

13 Likes

Some points off the top of my head about why Julia cannot reach the performance of well optimized Rust code:

  • GC: I doubt that you can avoid the garbage collector completely in a real project.
  • Zero cost abstractions. I had an example in my blog post showing that iterators as an abstraction over good old for loops are not slower than these for loops, but sometimes even faster because of SIMD, avoided bounds checking etc.
  • Low level features that are missing in Julia. I did mention one example in my post when talking about capacity.
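
The last two bullets can be sketched roughly as follows. This is not the code from the blog post; the function names are made up, and no timing claim is attached, since whether the compiler vectorizes either version depends on the target and optimization level.

```rust
/// Sum of squares with an explicit index loop; `xs[i]` is a bounds-checked access.
fn sum_squares_loop(xs: &[f64]) -> f64 {
    let mut acc = 0.0;
    for i in 0..xs.len() {
        acc += xs[i] * xs[i];
    }
    acc
}

/// The same computation as an iterator chain; there is no indexing to check.
fn sum_squares_iter(xs: &[f64]) -> f64 {
    xs.iter().map(|x| x * x).sum()
}

/// Builds [0.0, 1.0, ..., n-1] after reserving the space once up front,
/// so the pushes do not need to reallocate.
fn ramp(n: usize) -> Vec<f64> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        v.push(i as f64);
    }
    v
}

fn main() {
    let xs = ramp(4); // [0.0, 1.0, 2.0, 3.0]
    println!("{} {}", sum_squares_loop(&xs), sum_squares_iter(&xs));
    println!("capacity >= len: {}", xs.capacity() >= xs.len());
}
```

To check the "zero cost" part for a given case, one would compare the generated assembly or benchmark both functions rather than take it on faith.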

Here are some benchmarks that I found, but I did not check the code:

In one of the benchmarks, Julia is better, but only by 4%. In this benchmark, it uses 31 times the memory that Rust uses.

But I don’t like micro benchmarks. I think it would be fun to do some benchmarking on a real problem. You can suggest one :slight_smile:

My subjective experience is that I am able to optimize Rust code much more easily than Julia code because of the points above and because there is no hidden complexity. For example, you know for sure whether you are passing a value by reference or whether it is cloned. The cost of passing a value is not hidden in Rust, which makes it harder to start with, but in the long term it is much easier, at least for me, to optimize.
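
A small sketch of what "not hidden" means here (function names are illustrative): borrowing shows up as `&` at the call site, and a copy of heap data only happens where `.clone()` is written out.

```rust
/// Borrows the data: nothing is copied and the caller keeps ownership.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Takes ownership of the vector and reuses its buffer for the result.
fn centered(mut xs: Vec<f64>) -> Vec<f64> {
    let m = mean(&xs);
    for x in xs.iter_mut() {
        *x -= m;
    }
    xs
}

fn main() {
    let data = vec![1.0, 2.0, 3.0];
    let m = mean(&data); // borrow, visible as `&`
    let shifted = centered(data.clone()); // copy, visible as `.clone()`
    // `data` is still usable because only the clone was moved.
    println!("{} {:?} {:?}", m, shifted, data);
}
```

Dropping the `.clone()` would move `data` into `centered`, and the compiler would reject the later use of `data` in the `println!`.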

I can blindly believe you. Julia in general supports writing efficient code (if we ignore the performance footguns).
Rust takes it to the next level but you have to pay for that extra. I definitely understand that not everyone has to get more low level than with Julia, especially in scientific computing.

1 Like

So does Julia. In fact, there’s an entire proof that its devirtualization scheme is correct, described in full detail.

https://arxiv.org/pdf/2109.01950.pdf

Are SciML solvers not a real project?

And Julia also has quite a bit of these. For example

You can inspect the LLVM code and know that it’s zero cost, so unless you believe in magic that’s zero cost.

Broadcast on iterators can SIMD just fine. See FastBroadcast.jl which

with @simd ivdep its broadcasting matches manual loops with SIMD enabled.

There is stuff to improve with Julia, but come on some of this is just the basics.

21 Likes

Could the main takeaways from this thread end up in a less biased blog post that compares Julia and Rust for scientific computing with input from both communities?

7 Likes

Agreed. Show me a full Rust benchmark on Bruss.

6 Likes

Hi Chris, first of all, I am a big fan of you! Thank you for your contributions to the Julia community :heart:

I am wrong on that point then, and I will edit my reply. It was something off the top of my head and I only quickly checked the Wikipedia article about multiple dispatch, which says that it is done dynamically using a vtable.

I should be more specific in my subjective guessing. I did mean big projects that are not a library. At least from my experience with BenchmarkTools.jl, it was very hard to keep the time taken by the GC under 10%. It depends on the project.

9 Likes

Yes! Indeed, I did add the warning about my bias in the introduction and did say at the end that you should write me especially if you are more biased towards Julia.

Proposals for editing are welcome! :heart:

You could even submit a pull request: website/index.md at main - website - Codeberg.org

Here are the first two edits after this discussion:

10 Likes

I highly recommend reading that Jan Vitek article, or watching his JuliaCon talk:

I don’t think it’s an exaggeration to say that the entire point of function specialization in Julia is to allow for call site devirtualization on multiple dispatch. The key here is that semantically dispatch is handled through look ups to the function table, though the compiler is allowed to optimize the dynamic behavior away if it can prove that the dispatch result is known at compile time. Then what is proven is that type-groundedness (or basically type-stability, in a clearer type-theoretic way) is exactly the property that’s required to correctly prove that dispatch results are known constants.

Therefore, the reason why Julia is fast is because function specialization + multiple dispatch allows for dynamic looking semantics to give the compiler enough information to prove and remove all dynamic dispatching.

In other words, Julia always looks like it has dynamic dispatch, though the compiled code (for “good code”) does not have dynamic dispatch and will statically dispatch. This of course is “under the hood” and not necessarily semantics of the language, it’s a compiler optimization, but it’s the important one that makes Julia code compile down to something like C. Because when this is done correctly, the compiled code is effectively static.

Depends on the project, but at the same time the core numerical kernels don’t tend to be where the issue is. Making a code use in-place operations is straightforward, what is missing is enforcing that a kernel never allocates so that future maintenance of a code gets errors and test failures if that property is ever lost.

Note that while saying this, I am currently drafting ideas and designs for more static subsets of Julia with more direct control over allocation behavior, so I definitely agree that there are currently some benefits of languages like Rust or C++ that I would like to see in a near future Julia. But you should be careful about where exactly this boundary is. Also the Julia allocator is a bit slower right now than it should be, but there are discussions on how to address that by v1.11.

28 Likes

LOL, I did say that DifferentialEquations.jl is objectively the best differential equations solver we have. I said it in my tiny talk and in my blog post. I was referring to this article by you (I did say that I am a fan, right :wink:). I will not try to compete with such a good package.

The ecosystem is much better for differential equations in Julia.

I meant something that you can implement without major third party libraries :smiley:

3 Likes

But I did not use the sum method in rayon. I did use rayon just as an equivalent of using @threads in Julia.

I know that the trivial problem can be solved much more efficiently with reduction (which sum does). The problem is so trivial that you can even just write down the solution of the computation with the silly loop. It is just a demonstration of multithreading effects :wink:
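
For readers who want to see what the reduction variant looks like, here is a minimal sketch using only standard-library scoped threads. This is an assumption-laden stand-in, not the rayon code from the blog post; `chunked_sum` is a made-up name. Each thread sums its own chunk, and the partial sums are combined at the end, so no shared mutable state is needed.

```rust
use std::thread;

/// Sums `xs` by giving each thread its own chunk and adding up the partial sums.
/// Panics if `n_threads` is 0.
fn chunked_sum(xs: &[u64], n_threads: usize) -> u64 {
    assert!(n_threads > 0, "need at least one thread");
    // Ceiling division, with a floor of 1 because `chunks(0)` would panic.
    let chunk_len = ((xs.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|s| {
        // Spawn all workers first, then join them; scoped threads may borrow `xs`.
        let handles: Vec<_> = xs
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .sum::<u64>()
    })
}

fn main() {
    let xs: Vec<u64> = (1..=100).collect();
    println!("{}", chunked_sum(&xs, 4)); // prints 5050
}
```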

1 Like

basically you’re just looking for a project that’s bigger than microbenchmark but small enough that you can compare different languages apple-to-apple-ish, here’s one https://indico.jlab.org/event/459/contributions/11540/attachments/9410/13651/CHEP2023%20Polyglot%20Jet%20Finding.pdf

this is a track clustering (into “jet”) algorithm, reference implementation is in C++ because particle physicists love making life harder for themselves. In this work you see the same algorithm being implemented in Numpy and Numba and Julia, and in fact in one case Julia is faster than C++ because C++ missed a SIMD optimization.

I’m sure you can get Rust to work, but I doubt you can be significantly faster than C++ :man_shrugging: . My point here is this is a real scientific computing example where Julia has the same speed as the best out there while only needing a fraction of the development effort.

(note: FastJet C++ was for many, many years compiled with -ffast-math, which, when linked together with millions of lines of other physics code, probably produced many errors)

15 Likes

I don’t like micro-benchmarks either (I liked them before being convinced that it is possible to tune any of the “fast” languages to compete with each other at the same level, including Julia of course).

For that site, specifically, I had previously looked at the nbody simulation example, which is more or less on my domain. Unfortunately I cannot reproduce the Rust speed there:

I tried to build the faster Rust example (4-i.rs) following some instructions, and get:

Finished release [optimized] target(s) in 0.07s
~/Downloads/test4/target/release% time ./test4 5000000 
-0.169075164
-0.169083134

real	0m5,389s
user	0m5,388s
sys	0m0,001s

While the Julia code runs with:

~/Downloads% time julia 7.jl 5000000
-0.169075164
-0.169083134

real	0m1,271s
user	0m1,323s
sys	0m0,350s

#or, from within Julia
julia> include("./7.jl")

julia> @btime nbody!($bodies, 5000000)
  243.013 ms (0 allocations: 0 bytes)

So yes, I don’t like micro-benchmarks: they are too far from real experience, and full of bad practices in many languages. In all the threads I followed where experienced people were competing to get the best out of each language, in the end most got almost the same result, with some specific exceptions in which achieving the best performance might be harder in one language or another (Julia, for instance, is harder if one is dealing with collections of different types, which might trigger dynamic dispatch and require some workarounds).

What I did do, personally, is having an implementation of cell lists for computing pairwise properties of systems of particles, which can compete with extremely optimized C++ packages or the SciPy alternatives (in C++ I guess). So, in a real-world scenario, for me Julia did well in terms of performance.

This is another more real world case that ended up being optimized in a thread here: Reducing the ecological impact of computing through education and Python compilers | Nature Astronomy

8 Likes

One point that is often overlooked in discussions around dynamic vs static typing or prototyping vs “large system” programming is nicely captured by this quote from Paul Graham (from On Lisp):

You don’t just write the same program faster; you write a different kind of program.

Imho, this is not just the case when you can quickly and interactively explore ideas, e.g. in Lisp or Smalltalk, but also when a language facilitates thinking more abstractly about your application domain, e.g. Haskell. In this respect, shorter code, i.e., with less boilerplate distracting from the actual problem, is also invaluable, e.g., taken to the extreme in APL or J. Julia, too, has a lot to offer in these respects.
Furthermore, also the following P. Graham quote applies nicely to Julia and the two-language problem in general:

In less abstract languages, you work for functionality. In Lisp you work for speed. Fortunately, working for speed is easier: most programs only have a few critical sections in which speed matters.

1 Like

Hi @Mo8it,
Welcome to the Julia Discourse.
Thanks for posting here your interesting blog article and video
and asking for feedback. I learned a lot.
I just hope you don’t get discouraged by the many responses and comments.
I think some comments and tones coming from some people are a little harsh for no reason.
I hope you stick around for more discussions about scientific Rust.

Best,
Olivier

18 Likes