Just curious whether anyone has raised the question of introducing Rust inside the Julia source tree, alongside C. I know that the Linux kernel has recently started doing this alongside its C source to improve memory safety (see the Wikipedia article on Rust for Linux). Since Julia’s GC is now multithreaded, and compilation is moving this way as well, I feel like Rust’s memory safety guarantees, compared to C, would help prevent issues that could arise. Having experienced several bugs from race conditions inside Julia itself from 1.11’s multithreaded GC alone (#56761, #56759, #56735), I feel like the increasing complexity of the Julia runtime would be well served by moving parts of it to Rust. In addition, the Julia community seems pretty on board with Rust, since juliaup is written in it.
(I do realise that this decision would lie with only a handful of people though. Just planting some seeds.)
I think this didn’t show up on 1.10 because Memory hadn’t been introduced at that point, even though the underlying data structure had a memory leak (since my code allocates tons of arrays). It sounds like we still don’t know what that bug was from though, only there was some leak from that structure(?). Rust would help greatly for such leaks, via ownership/lifetimes, no?
This sounds pretty interesting. Is there a devdocs guide somewhere? Or maybe it’s literally just landing?
The first part (adding support for MMTK if you compile your own copy of it) landed yesterday, and the 2nd part (binarybuilder support so you can get it if you compile Julia regularly with the right build flag set) is the PR I linked which isn’t merged. Expect more devdocs on this sort of thing closer to release.
It sounds like we still don’t know what that bug was from though, only there was some leak from that structure(?). Rust would help greatly for such leaks, via ownership/lifetimes, no?
I can think of two possible explanations for this bug.
One explanation (unlikely) is that we somehow got the list implementation wrong. This seems like a software engineering issue that could have been solved by implementing a high-quality container inside Julia and covering it with extensive unit testing, or by using a well-tested implementation provided by some language’s standard library (e.g. C++’s STL).
Rust could have helped here, but it would have helped just because it’s a language with a rich and well-tested standard library, not because of its memory safety properties (after all, the linked list implementation provided by their standard library has a considerable amount of unsafe code that manipulates raw pointers). It doesn’t differ from C++ here.
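To make the "well-tested standard library container" point concrete, here is a minimal sketch using Rust's `std::collections::LinkedList`. The function name `drain_in_order` and the `u32` work items are made up for illustration and have nothing to do with Julia's actual list. The caller writes no `unsafe`, even though the container's internals do.

```rust
use std::collections::LinkedList;

/// Pushes hypothetical work items onto a list and drains them in FIFO order.
/// (Illustrative helper, not part of any Julia or Rust API.)
fn drain_in_order(items: &[u32]) -> Vec<u32> {
    let mut queue: LinkedList<u32> = LinkedList::new();
    for &item in items {
        // Safe call; the pointer manipulation happens inside the std implementation.
        queue.push_back(item);
    }

    let mut out = Vec::with_capacity(items.len());
    while let Some(item) = queue.pop_front() {
        out.push(item);
    }
    out
}

fn main() {
    let drained = drain_in_order(&[1, 2, 3]);
    println!("{:?}", drained);
}
```

The benefit here comes from reusing a tested container, which is the same benefit `std::list` would give in C++.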
The other explanation for this bug that I can think of is that the list has a very poor layout that’s fragmenting libc’s allocator and making it request more and more pages. This is an issue of the underlying allocator, and we could be vulnerable to it even if we used Rust.
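On the question of what ownership does and doesn't buy you for leaks, a small sketch may help. It assumes a hypothetical `Tracked` type that counts how often its destructor runs. Ownership frees a value deterministically when it goes out of scope, but leaking is still possible in safe Rust (for example via `std::mem::forget` or reference cycles), and none of this affects how the underlying allocator fragments.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical buffer type that records when its destructor runs.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns (destructor runs after a normal scope exit,
///          destructor runs after an explicit, safe leak).
fn count_drops() -> (u32, u32) {
    let normal = Rc::new(Cell::new(0));
    {
        let _buf = Tracked { drops: Rc::clone(&normal) };
    } // `_buf` goes out of scope here, so its destructor runs.

    let leaked = Rc::new(Cell::new(0));
    // `mem::forget` is a safe function: the value is never dropped.
    std::mem::forget(Tracked { drops: Rc::clone(&leaked) });

    (normal.get(), leaked.get())
}

fn main() {
    let (normal, leaked) = count_drops();
    println!("dropped normally: {normal}, dropped after forget: {leaked}");
}
```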
This isn’t correct - you couldn’t write such code unless you were to write unsafe { ... }, which loses all memory safety guarantees provided by the compiler and is almost always avoided. In other words this would not be normal Rust code.
Safe Rust doesn’t allow you to write code exactly as you would in C, because such patterns aren’t memory safe. It requires a redesign to satisfy the safety requirements, which helps prevent issues such as leaks and races.
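As a rough illustration of the kind of redesign meant here: in C, several threads can increment a shared counter through a plain pointer and the compiler won't complain. Safe Rust rejects the direct equivalent at compile time, so the sharing has to be made explicit. The sketch below uses `Arc<Mutex<_>>`; the function name `parallel_count` is made up for the example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from several threads.
/// (Illustrative helper; the name and shape are assumptions.)
fn parallel_count(num_threads: usize, increments_per_thread: usize) -> usize {
    // Shared ownership (Arc) plus mutual exclusion (Mutex) must be spelled out;
    // handing a plain `&mut usize` to several threads does not compile.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

This guards against data races specifically. Logic-level races and deadlocks are still possible in safe code.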
One problem no one raised about Rust is that in BinaryBuilder we don’t have Rust toolchains for i686-w64-mingw32, aarch64-freebsd and riscv64-linux, because they are either unsupported by rustup or have runtimes incompatible with what we use. This means we can’t compile Julia dependencies for those platforms, which would significantly complicate the Julia build system there.
Users definitely routinely call code with unsafe blocks even if they don’t write any personally. Pushing to a vector has one, and a blogpost estimated that 7.5k/35k of standard library functions are unsafe. Everything gets unsafe deep down; the advantage of Rust is that its idea of safety is a statically knowable language semantic.
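A tiny sketch of that pattern, with a made-up function `first_or_none`: the signature is safe, the body contains an `unsafe` block, and the bounds check just before it is what justifies the block. Callers never see the `unsafe`, much as with `Vec::push`.

```rust
/// Returns the first element of a slice, or `None` if it is empty.
/// (Hypothetical helper; real code would simply use `values.first()`.)
fn first_or_none(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        None
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        Some(unsafe { *values.get_unchecked(0) })
    }
}

fn main() {
    println!("{:?}", first_or_none(&[7, 8, 9]));
    println!("{:?}", first_or_none(&[]));
}
```

The `unsafe` keyword marks exactly where a human has to uphold the invariant, which is the "statically knowable" part.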
That thread is different. It’s about rewriting Julia in Rust. I’m not suggesting that. Just introducing Rust into the source tree, like what Linux did.
(That thread also doesn’t seem to motivate its question with anything, whereas this thread is motivated by real concerns about memory safety.)
This is introducing confusion between direct use of unsafe code (unsafe) and indirect use of unsafe in safe abstractions (safe). Of course everything is unsafe deep down, but there’s a difference here. Those internal methods within LinkedList are actually unsafe to call and shouldn’t be used.
Internals are discouraged already; the unsafety is the lesser factor there. You’re technically right, but I’m trying to say that it doesn’t mean you shouldn’t use either at all, just that you have to handle either carefully. Rust marking unsafe code makes it easier to analyze and handle, and Julia has started to mark public API for similar purposes. We’re not going to get a safe-Rust linked list anytime soon, so if we need a linked list now, there’s no reason not to handle it carefully. Whether we need a linked list is a different question, often answered “no.”
The Julia runtime is a bit unusual. It is not actually a terribly big or complicated program, and we generally try to move anything we can into Julia.
The code generation part of the runtime uses C++ because that’s really the only first-class API for LLVM. The way it uses C++ has been criticized by real C++ programmers as “C with method calls”, which is entirely accurate and intentional. I suspect that using Rust to interface with LLVM would be a major impediment and cause us to have to wait not only for new LLVM releases but for Rust interfaces to LLVM to catch up to those releases, and of course it’s an extra layer of potential bugs. So I don’t think replacing code generation stuff with Rust would be a win.
Then there’s the basic OS runtime stuff. This is written in C and could more plausibly be implemented in Rust. However, a very large amount of this would have to be unsafe. As I said, it’s a very unusual program: it inherently needs to do a lot of unsafe (in the Rust sense) low-level memory manipulation and does very little dynamic memory allocation that isn’t subsequently managed by Julia’s own GC. There’s a very small amount of concurrent data structure work, but it’s pretty minimal and unlikely to grow or change too much. Rust could maybe help there, but it seems better to keep the runtime as simple and lowest-common-denominator as possible, which favors C.
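To give a feel for what "unsafe in the Rust sense" looks like for this kind of code, here is a sketch of manually allocating a block and writing a header into it through a raw pointer. The `Header` layout and the function `roundtrip_header` are invented for illustration and are not Julia's actual object layout or allocator. Nearly every interesting line has to sit inside `unsafe`.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

/// Hypothetical object header, loosely in the spirit of a GC-managed object.
#[repr(C)]
struct Header {
    tag: usize,
    size: usize,
}

/// Allocates a raw block (header plus `payload` bytes), writes a header,
/// reads it back, and frees the block. Returns the (tag, size) read back.
fn roundtrip_header(tag: usize, payload: usize) -> (usize, usize) {
    let layout = Layout::from_size_align(size_of::<Header>() + payload, align_of::<Header>())
        .expect("valid layout");

    // Raw allocation and pointer reads/writes are outside what the borrow
    // checker can verify, so all of it needs an `unsafe` block.
    unsafe {
        let raw = alloc(layout);
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        let header = raw as *mut Header;
        header.write(Header { tag, size: payload });
        let read_back = header.read();
        dealloc(raw, layout);
        (read_back.tag, read_back.size)
    }
}

fn main() {
    println!("{:?}", roundtrip_header(42, 64));
}
```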
Thanks Stefan - I really appreciate you taking the time to think this through and share a thorough explanation. Your reasoning about simplicity makes perfect sense, especially given Julia’s philosophy of moving more into Julia itself. The MMTk work seems like an interesting exploration in this space and I’m excited to see that progress.