Comparison of Rust to Julia for scientific computing?

For scientific computing can Rust have any advantage over Fortran?

Yes. Generic programming. Most high performance Fortran requires a preprocessor in another language because Fortran doesn’t give you the necessary tools to write fast code in Fortran.
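To make the generic programming point concrete, here is a minimal Rust sketch, not taken from any library: a dot product written once and compiled separately for each element type. The function name `dot` and the use of `Default` as a stand-in for "zero" are assumptions made for illustration.

```rust
use std::ops::{Add, Mul};

/// Dot product written once for any element type that supports `+` and `*`.
/// Assumption: `T::default()` is the additive zero, which holds for the
/// built-in numeric types (0 for integers, 0.0 for floats).
pub fn dot<T>(a: &[T], b: &[T]) -> T
where
    T: Copy + Default + Add<Output = T> + Mul<Output = T>,
{
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    a.iter()
        .zip(b)
        .fold(T::default(), |acc, (&x, &y)| acc + x * y)
}
```

Calling `dot(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0])` and `dot(&[1, 2], &[3, 4])` uses the same source for `f64` and `i32`, with no preprocessor involved.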


There is a Jupyter kernel for Rust: evcxr (GitHub: google/evcxr).

I’ve written a fair share of Rust lately (juliaup is written in Rust), and in my mind the biggest reason I wouldn’t use Rust for scientific computing is also one of the nicest things about Rust for systems programming: the language really forces you to handle and think about every single corner case where something could go wrong. That is fantastic if one writes a piece of code that will run on tens of thousands of machines and that just must always work, but for typical scientific computing code it imposes a huge amount of mental overhead that I think most folks wouldn’t want.
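As a rough illustration of what "handling every corner case" looks like in practice, here is a small sketch that computes the mean of whitespace-separated numbers. The names `mean_of` and `MeanError` are hypothetical; the point is only that both failure modes show up in the function signature and the caller cannot ignore them silently.

```rust
use std::num::ParseFloatError;

/// The ways `mean_of` can fail (hypothetical error type for this sketch).
#[derive(Debug, PartialEq)]
pub enum MeanError {
    /// The input contained no numbers at all.
    Empty,
    /// A token could not be parsed as a floating-point number.
    BadNumber(ParseFloatError),
}

/// Parse whitespace-separated numbers and return their mean.
pub fn mean_of(text: &str) -> Result<f64, MeanError> {
    let mut sum = 0.0;
    let mut count = 0usize;
    for token in text.split_whitespace() {
        // `parse` returns a `Result`; the error is converted and propagated with `?`.
        let value: f64 = token.parse().map_err(MeanError::BadNumber)?;
        sum += value;
        count += 1;
    }
    if count == 0 {
        // Dividing by zero would silently produce NaN, so report it instead.
        return Err(MeanError::Empty);
    }
    Ok(sum / count as f64)
}
```

In a throwaway analysis script one would probably just let the bad token crash the run, which is the extra mental overhead being described.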

I could see a different situation where someone is building a production system, say some web app that has a backend that does some scientific computing. In such a scenario the static typing and all the rest of the things that the compiler forces you to handle in Rust might be welcome.


This can be done in languages with dependent types such as Idris when you work with the proper types. For example, from Type-Driven Development with Idris:

4.2.3. Indexing vectors with bounded numbers using Fin
Because Vects carry their length as part of their type, the type checker has additional knowledge that it can use to check that operations are implemented and used correctly. One example is that if you wish to look up an element in a Vect by its location in the vector, you can know at compile time that the location can’t be out of bounds when the program is run.

Working with bounded types and providing proofs puts a pretty high burden on the programmer, so it is highly unlikely to be undertaken in research scientific computing, where Julia’s interactivity shines.


I believe Fortran has at least good array handling, and while I don’t really know it, I found: Generic programming in Fortran Wiki

So I’m not sure Rust even has the generic programming advantage that Oscar mentioned. Maybe in practice it’s not done much in Fortran because people are slow to adopt modern Fortran standards (and Fortran 2003 is not even brand-new; Fortran 2008 and 2018 have followed since)?

I’m a big believer in languages working together, and it seems Julia can reuse everything Rust comes up with (i.e. it’s easy to call), just as it’s easy to call C or Fortran, so all libraries in those languages can be reused. Of course the whole will not be as safe as if all of it had been written in Rust.

I got my own idea while reading and answering this thread (and posted a link to some bounds-checking research there, that I’ve not yet read…):

Gridap seems to be rapidly catching up there…


I played with Rust briefly, and rather liked it.
In large part because I did get the impression that software written in it would tend towards being built to last and be easy to maintain, which are things I’ve been longing for.
Rust’s traits are a great example of what I’m talking about here.

However, scientific computing is dominated by exploratory analysis and throwaway scripts. Thus, Rust’s tradeoffs don’t seem like the right ones for Julia’s use cases.

I do think however that it would be reasonable for Julia libraries to be wrappers of Rust libraries, when there isn’t much to gain from being pure-Julia.
E.g., I have no desire to differentiate the reading of a CSV file, so writing a Julia CSV library in Rust seems like a solid choice. Same goes for plotting.
Of course, we already have CSV.jl and plenty of plotting libraries, so that’s just an example.


I don’t see how that counts as doing bounds checking in general at compile time. It enables you to do some compile-time checks.

Imagine that you hash a message input by the user at runtime, mod the hash by 256, and use the result to index a length-128 array. It is fundamentally impossible for the compiler to know whether that operation will be out of bounds. You have to perform the check at runtime.
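The scenario above can be sketched in Rust as follows. The function name `lookup` and the choice of the standard library's `DefaultHasher` are assumptions for illustration; any hash would do. Since the index is only known at run time, the lookup goes through `get`, which performs the check and returns `None` when the index falls outside the 128 elements.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a runtime message, reduce it mod 256, and use the result to index
/// a 128-element table. Roughly half of all possible indices are out of bounds.
pub fn lookup(table: &[u8; 128], message: &str) -> Option<u8> {
    let mut hasher = DefaultHasher::new();
    message.hash(&mut hasher);
    let index = (hasher.finish() % 256) as usize;
    // `get` does the bounds check at run time and reports failure as `None`,
    // whereas `table[index]` would panic on an out-of-bounds index.
    table.get(index).copied()
}
```

Nothing about the types tells the compiler whether a given message lands in bounds, so the check cannot be moved to compile time here.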

Isn’t it the same thing as StaticArrays.jl? Probably a lot of scientific computing can be done that way (since one often works with fixed discretisation sizes)… but adaptivity would be out.

In a sound dependently typed language you then have to provide a proof in the language that the result of mod is a valid index.

No, you can still index StaticArrays out of bounds (even if the size is part of the type signature).


Rust is safer in the sense that it has a compile-time, ownership-based type system (for memory management), which makes it safer than C, where there is no such check, and C++, where safety is enforced through specific usage patterns of constructors/destructors and scope rules. Generally the cost of this system in Rust is minimal or even zero because all these checks are erased during compilation. The (runtime) cost is not always zero, since this system doesn’t handle cyclic structures or other complicated lifetimes well, so you sometimes need to maintain some extra runtime data to pass compilation. In summary, Rust is generally fast, not slow, because it doesn’t pay the cost that most GC languages have to pay.
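One common instance of the "extra runtime data" is a structure with back-pointers. The sketch below is only an illustration with hypothetical names (`Node`, `new_node`, `adopt`): a tree node that points to its children and back to its parent, using `Rc`, `Weak`, and `RefCell` from the standard library. The reference counts and borrow flags are maintained at run time.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A tree node that points down to its children and back up to its parent.
/// The back-pointer is a `Weak` reference so that parent and child do not
/// keep each other alive forever.
pub struct Node {
    pub value: f64,
    pub parent: RefCell<Weak<Node>>,
    pub children: RefCell<Vec<Rc<Node>>>,
}

/// Create a node with no parent and no children.
pub fn new_node(value: f64) -> Rc<Node> {
    Rc::new(Node {
        value,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Link `child` under `parent` in both directions.
pub fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    // Strong reference downwards: increments the child's reference count.
    parent.children.borrow_mut().push(Rc::clone(child));
    // Weak reference upwards: does not keep the parent alive.
    *child.parent.borrow_mut() = Rc::downgrade(parent);
}
```

A plain tree without the parent link could use owned `Box` children and would need none of this bookkeeping.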

For bounds checking, @goerch is correct. Eliminating bounds checks automatically is impossible in general because it’s an undecidable problem. But in dependently typed languages, eliminating all bounds checks at compile time is entirely possible. This doesn’t contradict the previous claim because it is not a fully automatic approach: every array access must be accompanied by a proof, which is simply a witness that the index is in bounds. Note that in a DT language a proof is also a value, not something outside the system. So to access values in the array, you need to construct the index and also the proof (both are values). The proof can be erased during compilation. This may sound a bit tricky, but it does work.

Back to the original question, I think it’s fine to use Rust for a numerical analysis course. Unless extensive usage of external numerical packages is unavoidable, they can simply implement everything they need from scratch, just like development in C++/Fortran. But since Rust won’t have a place in scientific computing in the foreseeable future, why not simply have the students write C++/Fortran/C if they prefer static languages…


I don’t know Rust, but does interactive development really work well here? I’m a bit suspicious since Julia is designed from the ground up for interactive use.

I think that what @sswatson is saying is that providing the proof, if possible, generating the index value of the proper index type, amounts to a runtime check, even though the bounds check itself is guaranteed by the compiler.
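Rust has no dependent types, but the idea of "a runtime check that produces a value of the proper index type" can be approximated with const generics. This is a loose analogue of Idris's `Fin`, not an equivalent. The names `BoundedIndex` and `at` are hypothetical.

```rust
/// An index known to be smaller than `N`.
/// When this type lives in its own module, the private field means the only
/// way to obtain a value is through `new`, so holding a `BoundedIndex<N>`
/// acts as evidence that the bounds check already happened.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct BoundedIndex<const N: usize>(usize);

impl<const N: usize> BoundedIndex<N> {
    /// The single runtime check: returns `None` when `i >= N`.
    pub fn new(i: usize) -> Option<Self> {
        if i < N {
            Some(Self(i))
        } else {
            None
        }
    }
}

/// Indexing a length-`N` array with a `BoundedIndex<N>` needs no further check.
pub fn at<T: Copy, const N: usize>(data: &[T; N], idx: BoundedIndex<N>) -> T {
    // SAFETY: `idx.0 < N` was established in `BoundedIndex::new`.
    unsafe { *data.get_unchecked(idx.0) }
}
```

The check has not disappeared; it has moved to the point where the index is created, and the type system ensures it cannot be skipped or mixed up with an index for an array of a different length.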

Back to Julia, it would be poor practice indeed to use @inbounds when indexing an array with user-supplied indices. You would likely only do it when using something like eachindex or have provided your own mental proof that index values are in bounds. Now code like that in a dependently-typed language would perform all the bounds checking at compile time–no requirement that you’ve thought through all the cases.

Curiously, a few weeks ago I asked on Zulip the same question about the possible role of Rust in scientific computing. @ExpandingMan shared this blog post he wrote some time ago where he explains his opinions about the use of Rust (and Zig) in this field: tvu-compare: rust and zig.

I don’t have direct experience with Rust (although I’m very curious about it and I’d like to learn more about it at some point), but my feeling is Rust is a good choice if you’re ok with using C/C++/Fortran in the first place: if you like that kind of workflow, then Rust can be a refreshing new option, without sacrificing speed. But if you need interactivity (probably a very common need in research), then it isn’t very useful on its own. People are still going to write wrappers in other high-level and interactive languages. This is for example the case for polars: it’s a popular dataframes implementation written in Rust, but most people use it through the Python wrapper. So no, Rust doesn’t really address the two-language problem, if you care about it.

That said, I think Rust can be a good improvement on C/C++/Fortran if someone wants to start a new low-level project from scratch, also taking advantage of the good build system. Probably we can imagine a future where some low-level libraries are written in Rust and we use them from Julia instead of C/C++/Fortran libraries. However, I should also point out that at the moment you can’t build cdylibs (shared libraries with a C interface) for systems using the Musl C library (although my understanding is that this issue is being addressed and should be solved in the future), and you can’t build any Rust package at all for 32-bit Windows that is compatible with the Julia runtime (and this issue is not going to be fixed). So cross-platform interoperability is actually a problem for platforms already supported by Julia.


Just one side note on the interactive case: Rust’s incremental compilation is very fast. When I am working on juliaup and am in an edit-compile-run loop, it has often subjectively felt faster than a situation where I have to restart a Julia session…


If you really are sure that bounds checks are your bottleneck (in some cases the optimiser removes the checks; also always profile before optimizing to avoid surprises), there is get_unchecked. In general, I do not see any reason why Rust should suffer in performance compared to Julia. Rust is really fast. For example, there is rustfft, which at some point even beat FFTW.
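For reference, a small sketch of the three usual options for a summation loop: ordinary checked indexing, `get_unchecked` (a real method on slices, usable only inside `unsafe`), and an iterator. The function names are made up for this example, and no performance numbers are implied; profile your own code.

```rust
/// Sum with ordinary indexing: each `data[i]` is bounds-checked,
/// although the optimiser can often remove the check in a loop like this.
pub fn sum_checked(data: &[f64]) -> f64 {
    let mut total = 0.0;
    for i in 0..data.len() {
        total += data[i];
    }
    total
}

/// Sum with `get_unchecked`: no bounds check is performed, so the author of
/// the `unsafe` block is responsible for keeping `i` in range.
pub fn sum_unchecked(data: &[f64]) -> f64 {
    let mut total = 0.0;
    for i in 0..data.len() {
        // SAFETY: `i` comes from `0..data.len()`, so it is always in bounds.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

/// Iterators sidestep the question: there is no index to check.
pub fn sum_iter(data: &[f64]) -> f64 {
    data.iter().sum()
}
```

In idiomatic Rust the iterator version is usually the first thing to try, with `get_unchecked` kept for hot spots where profiling shows the checks matter.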

I have written a fair share of Rust code during my PhD, mostly simulation code. Your mileage might vary, but I highly enjoyed it and was very happy with the result. However, there are not nearly as many specialised scientific libraries, so you are mostly on your own.

For array-heavy code it is not as comfortable as Julia syntax-wise. For example, ndarray ends up feeling more like the numpy ecosystem. There is nalgebra as well. Const generics are a nice addition for array-heavy code, as they allow your dimensions to be checked statically at compile time.
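A minimal sketch of what the const generics point means, using a hand-rolled type rather than ndarray or nalgebra (whose APIs differ): the shape is part of the type, so a product with mismatched inner dimensions is rejected by the compiler. `Matrix` and `matmul` are hypothetical names.

```rust
/// A dense matrix whose shape (`R` rows, `C` columns) is part of its type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    pub data: [[f64; C]; R],
}

/// Matrix product. The inner dimension `K` must be the same in both
/// operands, otherwise the call does not type-check.
pub fn matmul<const R: usize, const K: usize, const C: usize>(
    a: &Matrix<R, K>,
    b: &Matrix<K, C>,
) -> Matrix<R, C> {
    let mut out = Matrix { data: [[0.0; C]; R] };
    for i in 0..R {
        for j in 0..C {
            for k in 0..K {
                out.data[i][j] += a.data[i][k] * b.data[k][j];
            }
        }
    }
    out
}
```

Multiplying a 2×3 matrix by a 3×2 matrix compiles and yields a `Matrix<2, 2>`; passing two 2×3 matrices to `matmul` is a compile error rather than a runtime dimension mismatch.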

What I really do miss in Julia is a static type checker. When reasoning about your code it actually helps (me) a lot to nail down the data structures and types first and then implement the actual code.

I do not buy the “scientific coding is mostly exploratory and uses throwaway code, and therefore it’s legitimate to not care about best practices and code quality; quick and dirty for the win” argument at all. It’s true that a lot of unfinished ideas are cast into code during an experimental phase. However, in my painful experience going through other scientists’ code, those quick and dirty codes tend to stay around and end up (their results anyway) in papers. But adhering to scientific principles also means making sure your results are reproducible and as easy to comprehend as possible. And this includes your scientific code. Therefore, you need to write robust, bug-free, and readable code. Having a good type system that allows you to declare and specify your assumptions and invariants helps with that a lot.

In the end, in both dynamic and static type systems you have to worry about the same things, otherwise you get nasty bugs (faulty results). The difference is, with a static type system you need to be explicit.
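One small example of "being explicit", as a sketch with hypothetical types: wrapping plain `f64` values in unit-carrying newtypes so that the assumption "this argument is a distance, that one is a time" is stated in the signature and checked by the compiler.

```rust
/// Distinct wrapper types make the unit of each quantity explicit.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub f64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Seconds(pub f64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MetersPerSecond(pub f64);

/// Average speed over a distance. Swapping the two arguments is a type
/// error, whereas with two bare `f64` parameters it would be a silent bug.
pub fn average_speed(distance: Meters, elapsed: Seconds) -> MetersPerSecond {
    MetersPerSecond(distance.0 / elapsed.0)
}
```

With this in place, `average_speed(Seconds(8.0), Meters(100.0))` does not compile, while the dynamically typed equivalent would run and return a wrong number.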

Summa summarum: Yes, you can write performant scientific code in Rust. Ergonomics and the missing ecosystem for scientific libraries can be a hassle (depending on your domain); however, IMHO you tend to write better code (in the sense of robustness, reproducibility, readability) if you are willing to (and you should be).


The culture of the community often emphasizes performant code and pushes people toward writing it, but the language itself does not, so it is a constant fight against entropy, which pushes the other way. It requires discipline to write fast Julia code, in a way that writing fast C/C++/Fortran/Rust doesn’t.
Of these languages, Rust probably requires the least discipline, making it the easiest to write quality code, because of all the help provided by the language.

I did experiment with the Rust e-graph library egg, and compared it to the Julia library Metatheory.jl.
Egg was faster to compile.
Egg also ran to convergence on my problem in less time than it took Metatheory per iteration, making it several hundred times faster.

Many people within the Julia community can and do write high performance packages, but at least within the Julia community, there are also many people writing slow packages, too.
But to what extent is that a bad thing, if people can easily – and with minimal resistance – write working, correct code?
I heard from someone that a very popular Julia plotting library (GR.jl, the default backend to Plots.jl) was obviously written by a Python programmer without any real Julia knowledge. That it was filled with Python idioms, and changing just a few things made it 2x faster.
I think the major win here is that someone who apparently didn’t have any real Julia knowledge could still make a major (positive) contribution to the Julia ecosystem, hopefully without spending too much of their time.

I’m a speed nut and perhaps a perfectionist to the point of valuing principle over pragmatism. (Also, being a perfectionist does not mean my own code is perfect – it means I hate everything I’ve ever written for not being perfect. Please keep that in mind if I sound judgmental of others’ work; I am not trying to dismiss its usefulness and am no less harsh with respect to my own.)
So while I would like to see a language that enforces this or holds the programmers hands to get there, I also need to remind myself that there are real pros and cons to these tradeoffs.
Perhaps in the future we can move more towards the best of both worlds, with libraries like JET.jl helping to analyze code and point out problems.
As Taku put it, writing performant code requires restricting yourself to an optimizer-defined subset of the language; it would be great if we could have automated testing for this, making it easier to accept and review PRs, and for newcomers to learn.
Also, it’d be easier politically to enforce requirements imposed by a tool. I’m not being a nitpicky jerk, it’s this tool imposing those draconian requirements and code changes!

Disclaimer: I only played around with Rust for a couple weeks (a few months ago), so I am very far from an expert and also do not have any experience with it (or C/C++/Fortran) in a production environment, maintaining a large code base, dealing with dependencies and regressions, etc…

Once it is written, it’ll be very difficult to justify to anyone else why it should be rewritten, regardless of quality. Thus, I do think there is a lot of value to getting it as correct as possible the first time.


But it’s much easier to improve a piece of Julia code while staying in Julia. As many examples on this forum show, people can improve a fairly complex program without fully understanding the problem, because the performance-critical part can often be optimized on its own.

That is orders of magnitude easier than rewriting a Python or R library in C/C++ and then wrapping it again in Python/R so it doesn’t break for downstream users.

I think another interesting static language choice would be Odin Language.
It has built in support for vectors and matrices with support for Generics with Multi Morphism which in practice can be very similar to Julia’s style.

It doesn’t have the exposure of Zig / Rust but it is still a very nice and well designed programming language.
