Make Julia’s Error Codes Even Better Than Elm’s

At a higher level, since it may not be clear to people, Julia is in a tricky spot when it comes to error messages. Static languages like Elm have a relatively slow compilation phase which doesn’t affect runtime. Note that the linked page is about compiler errors. The example given here is a runtime error: an out-of-bounds index. (Genuine question: how are Elm’s runtime errors? Does it tell you what invalid index you used?) When an error occurs during compilation, if compilation is a separate step you can basically spend as much time on generating a good error message as you want. It will have no impact on runtime or binary size. Why doesn’t this apply to Julia? Because compilation happens during runtime. We could have more logic in the error paths and there’s active effort to improve syntax errors with the JuliaSyntax project, but you have to be very careful that working harder to make errors better doesn’t make the compiler much slower. And all the code that deals with errors is part of your runtime program. There’s also the issue that fewer things in the language are compiler errors in the first place, since many things that produce errors happen at runtime.
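For reference, here is a minimal sketch of the kind of runtime error being discussed, an out-of-bounds index on a plain `Vector`. The helper name `error_message` is hypothetical and exists only for this illustration; `sprint` and `showerror` are standard Base functions.

```julia
# Hypothetical helper (not from the thread): run `f` and return the
# rendered error message as a String, or `nothing` if no error is thrown.
function error_message(f)
    try
        f()
        return nothing
    catch err
        # `showerror` renders the exception the same way the REPL would.
        return sprint(showerror, err)
    end
end

v = [10, 20, 30]

# Indexing past the end throws a BoundsError at runtime, not at compile time.
msg = error_message(() -> v[5])
println(msg)
```

The rendered message names both the array being accessed and the offending index, so in this particular case Julia does report which invalid index was used.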

What about dynamic languages? They often give pretty good error messages with runtime values for function arguments. This is very true but they are also traditionally slow. One of the benefits of an interpreter is that code and data are all just data and you have them around when an error occurs. This is very similar to why writing a debugger for an interpreted language implementation is pretty simple. In fact, we have a debugger just like that! But it’s too slow and people complain about that. Guess what happens if we modify Julia to be able to print argument values at every layer of the call stack when an error occurs? Wouldn’t that be an amazing debugging experience? Unfortunately that prevents the vast majority of program optimizations, so it would make Julia almost as slow as an interpreter.
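To make the trade-off concrete, here is a small sketch of what a captured stack trace exposes today. The functions `inner`, `outer`, and `frame_summaries` are hypothetical names made up for this example; `stacktrace`, `catch_backtrace`, and the `func`, `file`, and `line` fields of `StackFrame` are standard Base.

```julia
# Hypothetical two-level call chain that fails with an out-of-bounds index.
inner(v, i) = v[i]
outer(v) = inner(v, 10)

# Hypothetical helper: run `f`, and if it throws, summarize each stack frame.
function frame_summaries(f)
    try
        f()
        return String[]
    catch
        st = stacktrace(catch_backtrace())
        # Each StackFrame records a function name and a source location,
        # but not the argument values that were live in that frame.
        return [string(fr.func, " at ", fr.file, ":", fr.line) for fr in st]
    end
end

frames = frame_summaries(() -> outer([1, 2, 3]))
foreach(println, frames)
```

Each frame tells you where the call happened, not what values were passed. Keeping those values recoverable at every level of the stack is the part that conflicts with optimization.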

There are fancy technologies to work around this, like DWARF debug information, but we tried that in the Gallium debugger and it was super crashy and unreliable. It turns out that compilers suck at emitting good debug info and LLVM’s JIT compiler barely supports it at all. We could try to fix the debug info that LLVM’s JIT generates but that’s a huge project that we don’t have the capacity for. And of course in languages like C/C++/Rust, people don’t expect to be able to debug release binaries in the debugger: they recompile their programs in a sometimes much slower debug version and then run the debugger on that. We could do something similar in Julia, but when should we generate debug code versus release code? People expect code they evaluate in the REPL to run at maximum speed and complain if it doesn’t go fast.

There are other technologies for getting better stack traces and debug info out of JIT language runtimes, but they’re also hard. V8 and other JavaScript runtimes use dynamic program deoptimization, which means taking optimized compiled code that’s running and rewriting its stack so that it can jump into a non-optimized version of the same code. This is a very fancy technique and especially hard to implement without having a performance impact on the fast path where you don’t deoptimize. It’s possible, but again, we don’t have the capacity in terms of compiler effort.

This is not to say there’s no hope: we can, should, and will improve our error messages. I’m just writing this to point out that it’s not the case that we can “just” do what other languages have done. Static languages and slow dynamic languages are in fundamentally easier positions when it comes to error messages. Julia’s unique combination of extremely high performance (and unforgiving performance-hungry users) with its dynamic nature makes this a uniquely hard problem that is a bit of a research project to solve.
