Make Julia’s Error Codes Even Better Than Elm’s

Extremely user friendly and helpful for new users

I’m confused. Is there meant to be discussion here? A call to action? A “some other language does some thing better, so someone go fix this in julia”?

Maybe we can start by reading that post and looking at examples of Julia’s errors and considering how they could be more beginner-friendly.

xs = [1,2,3]
xs[0]
% julia tmp.jl      
ERROR: LoadError: BoundsError: attempt to access 3-element Vector{Int64} at index [0]
Stacktrace:
 [1] getindex(A::Vector{Int64}, i1::Int64)
   @ Base ./array.jl:861
 [2] top-level scope
   @ /tmp/tmp.uyTu9zl540/tmp.jl:2
in expression starting at /tmp/tmp.uyTu9zl540/tmp.jl:2

Rewriting the error:

1 | xs = [1,2,3]
2 | xs[0] 
    ^^^^^

I tried to access a Vector{Int64} at index [0] but the valid indexes are 
[1], [2], [3].

The stack of function calls, from latest to earliest, was:
 [1] getindex(A::Vector{Int64}, i1::Int64)
   @ Base ./array.jl:861
 [2] top-level scope
   @ /tmp/tmp.uyTu9zl540/tmp.jl:2

  • arrows point at the location in the code, with context. Does julia currently have the ability to isolate code spans so that the ^^^^ arrows can work?
  • the valid indexes are mentioned
  • uses full sentences
  • uses “I” for the compiler, which does have a more relatable tone IMHO
  • doesn’t introduce “stacktrace” jargon
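
To make the arrow idea concrete, here is a minimal sketch of the rendering step only. It assumes the line number and column range of the offending expression are already known, which is exactly the open question above. `underline_span` is a hypothetical helper, not an existing Julia function, and it counts columns as characters, so it is only meant for plain ASCII source.

```julia
# Hypothetical helper: number every source line and draw carets under one span.
# Assumes `line` and `cols` were already found by some other means.
function underline_span(source::AbstractString, line::Integer, cols::UnitRange{Int})
    out = String[]
    for (n, text) in enumerate(split(source, '\n'))
        gutter = "$n | "
        push!(out, gutter * text)
        if n == line
            # Pad past the gutter and the columns before the span, then add carets.
            padding = " "^(length(gutter) + first(cols) - 1)
            push!(out, padding * "^"^length(cols))
        end
    end
    return join(out, '\n')
end

println(underline_span("xs = [1,2,3]\nxs[0]", 2, 1:5))
```

The hard part is not the printing but getting a reliable span for runtime errors in the first place.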

Thank you!

I hope you don’t plan to list all valid indices of a 4194304-element vector :wink:

In the case of Vector it could just give the range. Some container types might have holes.
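
As a rough sketch of the range idea, assuming a contiguous `AbstractVector`: the message can report the first and last valid index instead of listing every one. Both function names below are made up for illustration, and containers with holes would need their own wording.

```julia
# Hypothetical helpers: describe the valid indices without enumerating them.
function describe_valid_indices(A::AbstractVector)
    idx = eachindex(A)
    isempty(idx) && return "there are no valid indexes because the container is empty"
    return "the valid indexes are $(first(idx)) to $(last(idx))"
end

function friendly_bounds_message(A::AbstractVector, i::Integer)
    return "I tried to access a $(length(A))-element $(typeof(A)) at index [$i], but " *
           describe_valid_indices(A) * "."
end

println(friendly_bounds_message([1, 2, 3], 0))
# The message stays short no matter how large the vector is.
println(friendly_bounds_message(zeros(4194304), 0))
```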

I un-accepted my answer because I think there is more to talk about than just adding bounds to index errors.

Can Julia isolate code spans so that the ^^^^ arrows can work?

What do you think about a full-sentence policy? What about the explanatory first-person prose that Elm does?

Identifying code spans is definitely a good idea (where possible) and JuliaSyntax.jl (which is in the process of becoming a standard library) adds this for parser errors.

There is a form of error for bad input values; unfortunately, most of the error messages are like this:

ERROR: hey, this function just barfed due to your input value

But what input did it see?
It is much more user-friendly to print out the offending value:

ERROR: hey, this function just barfed due to your input value of "[res"

Why is this helpful? Sometimes it’s unclear to the user what the input was. How could this be?

  • the input is a complex quoted expression
  • the bad value occurs in the middle of a loop feeding thousands of inputs into the function
  • the input is the output of another function, which itself receives its input from yet another function
  • the syntax of the function is unclear and the user is trying to figure out what works

It’s often useful to see what the function received versus what the user thought they sent.

It is even better if the error gives a reason why the input is wrong:

ERROR: bad value for "a" -- your value is -1.0 . "a" must be >= 0 
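
For what it's worth, Base already has an exception type that carries the offending value next to an explanation: `DomainError(val, msg)`. A minimal sketch, where `checked_sqrt` is a made-up function used only for illustration:

```julia
# Hypothetical function, for illustration only.
function checked_sqrt(a::Real)
    # Report both the offending value and the rule it violated.
    a >= 0 || throw(DomainError(a, "\"a\" must be >= 0"))
    return sqrt(a)
end

try
    checked_sqrt(-1.0)
catch err
    # showerror renders the exception message without the stack trace.
    println(sprint(showerror, err))
end
```

The value is also available programmatically as `err.val`, so callers can inspect it instead of parsing the message.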

What about for other kinds of errors, like MethodErrors?

Yes, that is a much better error message. Unfortunately it also tends to make functions much slower even when errors aren’t thrown because it creates a complex branch that captures all values that are used in the error message, which forces those values to be materialized and/or heap allocated. Using a static error message, on the other hand, has negligible performance impact, which is why so many error messages are like that. We have some newish technology to improve this, such as LazyString, but while that can help, it doesn’t entirely eliminate the problem. I’m planning on trying to make some better error messages like you suggest to see how bad the impact is and evaluate what kind of compiler magic we need to make it not affect performance unbearably, but it almost certainly will require some compiler work.
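
To illustrate the difference, here is a small sketch assuming Julia 1.8 or newer, where the `lazy"..."` string macro is available. The two `check_*` functions are hypothetical. This shows the shape of the code only and makes no claim about how much the lazy form costs in any particular case.

```julia
# Hypothetical validation helpers, for illustration only (needs Julia 1.8+).

# Static message: nothing about `a` is captured by the error path.
check_static(a) = a >= 0 ? a : throw(ArgumentError("a must be >= 0"))

# Lazy message: `a` is still captured, but the string itself is only
# assembled when the message is actually printed.
check_lazy(a) = a >= 0 ? a : throw(ArgumentError(lazy"bad value for a: got $a, but a must be >= 0"))

try
    check_lazy(-1.0)
catch err
    println(sprint(showerror, err))
end
```

The lazy form still has to hold on to `a`, which is one reason it helps without eliminating the cost.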

At a higher level, since it may not be clear to people, Julia is in a tricky spot when it comes to error messages. Static languages like Elm have a relatively slow compilation phase which doesn’t affect runtime. Note that the linked page is about compiler errors. The example given here is a runtime error: an out-of-bounds index. (Genuine question: how are Elm’s runtime errors? Does it tell you what invalid index you used?) When an error occurs during compilation, if compilation is a separate step you can basically spend as much time on generating a good error message as you want. It will have no impact on runtime or binary size. Why doesn’t this apply to Julia? Because compilation happens during runtime. We could have more logic in the error paths and there’s active effort to improve syntax errors with the JuliaSyntax project, but you have to be very careful that working harder to make errors better doesn’t make the compiler much slower. And all the code that deals with errors is part of your runtime program. There’s also the issue that fewer things in the language are compiler errors in the first place; many things that produce errors happen at runtime.

What about dynamic languages? They often give pretty good error messages with runtime values for function arguments. This is very true but they are also traditionally slow. One of the benefits of an interpreter is that code and data are all just data and you have them around when an error occurs. This is very similar to why writing a debugger for an interpreted language implementation is pretty simple. In fact, we have a debugger just like that! But it’s too slow and people complain about that. Guess what happens if we modify Julia to be able to print argument values at every layer of the call stack when an error occurs? Wouldn’t that be an amazing debugging experience? Unfortunately that prevents the vast majority of program optimizations, so it would make Julia almost as slow as an interpreter.

There are fancy technologies to work around this, like DWARF debug information, but we tried that in the Gallium debugger and it was super crashy and unreliable. It turns out that compilers suck at emitting good debug info and LLVM’s JIT compiler barely supports it at all. We could try to fix the debug info that LLVM’s JIT generates but that’s a huge project that we don’t have the capacity for. And of course in languages like C/C++/Rust, people don’t expect to be able to debug release binaries in the debugger; they recompile their programs in a sometimes much slower debug version and then run the debugger on that. We could do something similar in Julia, but when should we generate debug code versus release code? People expect code they evaluate in the REPL to run at maximum speed and complain if it doesn’t go fast.

There are other technologies for getting better stack traces and debug info out of JIT language runtimes, but they’re also hard. V8 and other JavaScript runtimes use dynamic program deoptimization, which means taking optimized compiled code that’s running and rewriting its stack so that it can jump into a non-optimized version of the same code. This is a very fancy technique and especially hard to implement without having a performance impact on the fast path where you don’t deoptimize. It’s possible, but again, we don’t have the capacity in terms of compiler effort.

This is not to say there’s no hope: we can, should, and will improve our error messages. I’m just writing this to point out that it’s not the case that we can “just” do what other languages have done. Static languages and slow dynamic languages are in fundamentally easier positions when it comes to error messages. Julia’s unique combination of extremely high performance (and unforgiving performance-hungry users) with its dynamic nature makes this a uniquely hard problem that is a bit of a research project to solve.

I still think that having some more compile-time errors would be nice. Hypothetically, if we didn’t pay any compile-time latency when there were no errors, taking slightly longer to print a nice error might be a tradeoff people could live with.

That’s what replacing the parser with JuliaSyntax.jl should do: better syntax errors (and faster too!).

I imagine getting nicer errors for other things requires moving lowering to it too then.

This doesn’t make sense, except for syntax errors. Julia already gives you a stack trace, which is exactly what’s necessary.

Error messages shouldn’t be prose, that’s a horrible idea. They should be easy to read and understand quickly.

And for syntax errors the future is already there (at least an experimental future):

[image not preserved]

Elm actually doesn’t have messages for runtime errors because they claim to never have runtime errors. Dividing by zero, for example, results in infinity instead of an error message.

Do they statically prove that all array indexing is inbounds?