Make Julia’s Error Codes Even Better Than Elm’s

NickW3101 · September 17, 2022, 4:26pm

Extremely user friendly and helpful for new users

Sukera · September 17, 2022, 4:49pm

I’m confused. Is there meant to be discussion here? A call to action? A “some other language does some thing better, so someone go fix this in julia”?

jar1 · September 17, 2022, 7:23pm

Maybe we can start by reading that post and looking at examples of Julia’s errors and considering how they could be more beginner-friendly.

xs = [1,2,3]
xs[0]

% julia tmp.jl      
ERROR: LoadError: BoundsError: attempt to access 3-element Vector{Int64} at index [0]
Stacktrace:
 [1] getindex(A::Vector{Int64}, i1::Int64)
   @ Base ./array.jl:861
 [2] top-level scope
   @ /tmp/tmp.uyTu9zl540/tmp.jl:2
in expression starting at /tmp/tmp.uyTu9zl540/tmp.jl:2

Rewriting the error:

1 | xs = [1,2,3]
2 | xs[0] 
    ^^^^^

I tried to access a Vector{Int64} at index [0] but the valid indexes are 
[1], [2], [3].

The stack of function calls, from latest to earliest, was:
 [1] getindex(A::Vector{Int64}, i1::Int64)
   @ Base ./array.jl:861
 [2] top-level scope
   @ /tmp/tmp.uyTu9zl540/tmp.jl:2

Arrows point at the location in the code, with context. Does Julia currently have the ability to isolate code spans so that the ^^^^ arrows can work?
The valid indexes are mentioned
Uses full sentences
Uses “I” for the compiler, which does have a more relatable tone IMHO
Doesn’t introduce “stacktrace” jargon
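To make the proposal a bit more concrete, here is a minimal sketch of how the full-sentence wording could be produced from the information a BoundsError already carries. `friendly_message` is a hypothetical helper, not something in Base, and it only handles arrays; it reads the exception's `a` (container) and `i` (attempted index) fields and prints the valid indices as a range per dimension rather than listing them. It does not attempt the ^^^^ code-span part.

```julia
# Hypothetical helper (not part of Base): rephrase a BoundsError as a sentence.
function friendly_message(err::BoundsError)
    generic = "I tried to access an index that is out of bounds."
    # A BoundsError constructed without arguments has neither field set.
    (isdefined(err, :a) && isdefined(err, :i)) || return generic
    container = err.a
    # This sketch only knows how to describe arrays.
    container isa AbstractArray || return generic
    # One "first:last" range per dimension, e.g. "1:3" or "1:5, 1:5".
    valid = join(map(ax -> string(first(ax), ":", last(ax)), axes(container)), ", ")
    attempted = join(err.i, ", ")
    return "I tried to access a $(summary(container)) at index [$attempted] " *
           "but the valid indices are $valid."
end

xs = [1, 2, 3]
msg = try
    xs[0]
catch err
    # Only reword bounds errors; let anything else propagate unchanged.
    err isa BoundsError ? friendly_message(err) : rethrow()
end
println(msg)
```

Because the message is built only in the `catch` branch, this particular sketch does nothing on the non-error path; whether Base could do the same inside `showerror` is a separate question.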

NickW3101 · September 17, 2022, 7:55pm

Thank you!

giordano · September 18, 2022, 12:35am

I hope you don’t plan to list all valid indices of a 4194304-element vector

jar1 · September 18, 2022, 12:40am

In the case of Vector it could just give the range. Some container types might have holes.
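A rough sketch of that distinction, assuming a hypothetical `describe_keys` helper (not in Base): vectors get a compact range, while a container that can have holes, such as a Dict, gets a truncated list of its keys. The `limit` keyword and the sorting of keys are illustration choices only.

```julia
# Hypothetical helper (not part of Base): summarize the valid keys of a container.

# Vectors have contiguous indices, so a range is enough regardless of length.
describe_keys(v::AbstractVector) = string(firstindex(v), ":", lastindex(v))

# Dict keys may have holes, so list them, but only up to `limit` entries.
function describe_keys(d::AbstractDict; limit::Int = 5)
    ks = sort!(collect(keys(d)))          # assumes the keys are sortable
    shown = join(ks[1:min(end, limit)], ", ")
    return length(ks) > limit ? shown * ", ..." : shown
end

println(describe_keys(zeros(Int8, 4194304)))                # prints 1:4194304
println(describe_keys(Dict(1 => "a", 2 => "b", 7 => "c")))  # prints 1, 2, 7
```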

gustaphe · September 19, 2022, 8:03pm

github.com/JuliaLang/julia

[RFC] Add index hint to BoundsError message

JuliaLang:master ← gustaphe:boundshint

opened 10:11PM - 14 Jan 22 UTC

gustaphe

+80 -4

I suggested a more human friendly error message in [this thread on discourse](ht…tps://discourse.julialang.org/t/override-the-error-message-for-avector-0/74306/6?u=gustaphe). Thought I'd try it out -- does this implementation make sense to you?

```julia
julia> [1,2,3][4]
ERROR: BoundsError: attempt to access 3-element Vector{Int64} at index [4]. Legal indices are 1:3.

julia> "åäö"[7]
ERROR: BoundsError: attempt to access 6-codeunit String at index [7]. Legal indices are between 1 and 5.

julia> rand(5,5)[1,3:6]
ERROR: BoundsError: attempt to access 5×5 Matrix{Float64} at index [1, 3:6]. Legal indices are [1:5, 1:5].
```

jar1 · September 19, 2022, 8:25pm

I un-accepted my answer because I think there is more to talk about than just adding bounds to index errors.

Can Julia isolate code spans so that the ^^^^ arrows can work?

What do you think about a full-sentence policy? What about the explanatory first-person prose that Elm does?

Oscar_Smith · September 19, 2022, 8:45pm

Identifying code spans is definitely a good idea (where possible) and JuliaSyntax.jl (which is in the process of becoming a standard library) adds this for parser errors.

blackeneth · September 20, 2022, 1:39am

There is a form of error for bad input values; unfortunately, most of the error messages are like this:

ERROR: hey, this function just barfed due to your input value

But what input did it see?
Much more user-friendly to print out the offending value

ERROR: hey, this function just barfed due to your input value of "[res"

Why is this helpful? Sometimes it’s unclear to the user what the input was. How could this be?

Input is a complex quoted expression
the bad value happens in the middle of a loop feeding 1000’s of inputs into the function
the input is the output of another function, which itself receives its input from yet another function.
the syntax of the function is unclear and the user is trying to figure out what works

It’s often useful to see what the function received versus what the user thought he sent.

Even better if the error gives a reason why the input is wrong

ERROR: bad value for "a" -- your value is -1.0 . "a" must be >= 0
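For what it's worth, Base already has an exception type that carries the offending value: `DomainError(val, msg)`. A small sketch, where `checked_sqrt` is a made-up function used only for illustration:

```julia
# Hypothetical function for illustration: reject negative input and
# report both the value that was seen and the rule it violated.
function checked_sqrt(a::Real)
    a >= 0 || throw(DomainError(a, "\"a\" must be >= 0"))
    return sqrt(a)
end

err = try
    checked_sqrt(-1.0)
catch e
    e
end

# The exception keeps the offending value, and showerror prints it.
println(err.val)
println(sprint(showerror, err))
```

The printed error includes the value the function actually received (-1.0) next to the reason, which is the "what did it see versus what I thought I sent" information.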

jar1 · September 20, 2022, 2:36am

What about for other kinds of errors, like MethodErrors?

StefanKarpinski · September 20, 2022, 1:18pm

Yes, that is a much better error message. Unfortunately it also tends to make functions much slower even when errors aren’t thrown because it creates a complex branch that captures all values that are used in the error message, which forces those values to be materialized and/or heap allocated. Using a static error message, on the other hand, has negligible performance impact, which is why so many error messages are like that. We have some newish technology to improve this, such as LazyString, but while that can help, it doesn’t entirely eliminate the problem. I’m planning on trying to make some better error messages like you suggest to see how bad the impact is and evaluate what kind of compiler magic we need to make it not affect performance unbearably, but it almost certainly will require some compiler work.
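As a rough illustration of the two styles, here is a sketch comparing a static message with one built through the `lazy"..."` string macro (LazyString, which requires Julia 1.8 or later). `static_check` and `lazy_check` are hypothetical names, and this does not measure anything; it only shows that the lazy message defers string construction until the message is actually printed.

```julia
# Hypothetical functions for illustration only.

# Static message: nothing about the input is captured.
function static_check(a::Float64)
    a >= 0 || throw(ArgumentError("bad value for a"))
    return sqrt(a)
end

# Lazy message: `a` is captured, but the string is only built when shown.
function lazy_check(a::Float64)
    a >= 0 || throw(ArgumentError(lazy"bad value for a: got $a, but a must be >= 0"))
    return sqrt(a)
end

err = try
    lazy_check(-1.0)
catch e
    e
end

println(typeof(err.msg))         # the message is still a LazyString here
println(sprint(showerror, err))  # interpolation happens when it is printed
```

The captured value still has to be kept alive for the error branch, so this reduces the cost of building the message rather than removing the cost of capturing the value.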

StefanKarpinski · September 20, 2022, 1:43pm

At a higher level, since it may not be clear to people, Julia is in a tricky spot when it comes to error messages. Static languages like Elm have a relatively slow compilation phase which doesn’t affect runtime. Note that the linked page is about compiler errors. The example given here is a runtime error: an out-of-bounds index. (Genuine question: how are Elm’s runtime errors? Does it tell you what invalid index you used?) When an error occurs during compilation, if compilation is a separate step you can basically spend as much time on generating a good error message as you want. It will have no impact on runtime or binary size. Why doesn’t this apply to Julia? Because compilation happens during runtime. We could have more logic in the error paths and there’s active effort to improve syntax errors with the JuliaSyntax project, but you have to be very careful that working harder to make errors better doesn’t make the compiler much slower. And all the code that deals with errors is part of your runtime program. There’s also the issue that fewer things in the language are compiler errors in the first place; many things that produce errors happen at runtime.

What about dynamic languages? They often give pretty good error messages with runtime values for function arguments. This is very true but they are also traditionally slow. One of the benefits of an interpreter is that code and data are all just data and you have them around when an error occurs. This is very similar to why writing a debugger for an interpreted language implementation is pretty simple. In fact, we have a debugger just like that! But it’s too slow and people complain about that. Guess what happens if we modify Julia to be able to print argument values at every layer of the call stack when an error occurs? Wouldn’t that be an amazing debugging experience? Unfortunately that prevents the vast majority of program optimizations, so it would make Julia almost as slow as an interpreter.

There are fancy technologies to work around this, like DWARF debug information, but we tried that in the Gallium debugger and it was super crashy and unreliable. It turns out that compilers suck at emitting good debug info and LLVM’s JIT compiler barely supports it at all. We could try to fix the debug info that LLVM’s JIT generates but that’s a huge project that we don’t have the capacity for. And of course in languages like C/C++/Rust, people don’t expect to be able to debug release binaries in the debugger; they recompile their programs in a sometimes much slower debug version and then run the debugger on that. We could do something similar in Julia, but when should we generate debug code versus release code? People expect code they evaluate in the REPL to run at maximum speed and complain if it doesn’t go fast.

There are other technologies for getting better stack traces and debug info out of JIT language runtimes, but they’re also hard. V8 and other JavaScript runtimes use dynamic program deoptimization, which means taking optimized compiled code that’s running and rewriting its stack so that it can jump into a non-optimized version of the same code. This is a very fancy technique and especially hard to implement without having a performance impact on the fast path where you don’t deoptimize. It’s possible, but again, we don’t have the capacity in terms of compiler effort.

This is not to say there’s no hope: we can, should, and will improve our error messages. I’m just writing this to point out that it’s not the case that we can “just” do what other languages have done. Static languages and slow dynamic languages are in fundamentally easier positions when it comes to error messages. Julia’s unique combination of extremely high performance (and unforgiving performance-hungry users) with its dynamic nature makes this a uniquely hard problem that is a bit of a research project to solve.

gbaraldi · September 20, 2022, 1:58pm

I still think that maybe having some more compile time errors would be nice. Because hypothetically, if we didn’t pay a compile time latency when there were no errors, it taking slightly longer to print a nice error might be a tradeoff people could live with.

StefanKarpinski · September 20, 2022, 1:59pm

That’s what replacing the parser with JuliaSyntax.jl should do: better syntax errors (and faster too!).

gbaraldi · September 20, 2022, 2:02pm

I imagine getting nicer errors for other things requires moving lowering to it too then.

nsajko · September 20, 2022, 3:01pm

This doesn’t make sense, except for syntax errors. Julia already gives you a stack trace, which is exactly what’s necessary.

Error messages shouldn’t be prose, that’s a horrible idea. They should be easy to read and understand quickly.

nilshg · September 20, 2022, 4:02pm

And for syntax errors the future is already there (at least an experimental future):

NickW3101 · September 20, 2022, 6:27pm

Elm actually doesn’t have messages for runtime errors because they claim to never have runtime errors. Dividing by zero, for example, results in infinity instead of an error message.

StefanKarpinski · September 20, 2022, 6:28pm

Do they statically prove that all array indexing is inbounds?
