Why don't error messages name offending variables/function returns?

While I’m generally loving the experience of learning and using Julia, a frequent source of frustration is parsing its error messages. Even when I understand the content of the error message and I’m given the offending line, it can still be hard to actually nail down the source of the error because I’m not told what within the line actually caused the offense. Ideally a type conversion error would name the variable that had the wrong type. An array bounds error would name the array I was trying to access incorrectly, and so on.

As an example, I once racked my brain for half a day trying to figure out why a line of arithmetic was trying to convert an object of type Nothing to Float64. With 6 variables in the line and no guidance, I ended up printing typeof() for each of them until I figured out that I had accidentally made a function call return nothing instead of the intended result. If the error had simply said Can't convert my_function(args) of return type Nothing to Float64, I would immediately have known that the issue was with the function result.

Perhaps there’s a limitation of the language here that I’m fundamentally misunderstanding, but I guess I just don’t understand why, if Julia can trace errors through a call stack to the specific line of source code, it can’t go one step further and name the variable or function that’s the problem. If there’s a more nuanced explanation here, please do enlighten me.

The conversion methods that throw the errors do not know anything about the call sites, so they cannot provide the information you want, and the stack trace does not bother to infer the types of variables or other subexpressions in the offending lines. It’s possible for the call site to throw the error instead, but you’d have to know which variables could have the wrong types to begin with and throw the errors yourself. You can instead use type inference reflection, e.g. @code_warntype, to spot statically knowable types of subexpressions, though the problem may not stand out. For example, you’d have to notice the possible +(::Float64, ::Nothing) call at the end (%8 adds x to %7, which is inferred as Union{Nothing, Float64}) because @code_warntype isn’t concerned about thrown errors and missing methods.

julia> foo(x::Float64) = x + ifelse(rand((true, false)), 1.0, nothing)
foo (generic function with 1 method)

julia> @code_warntype foo(1.2)
MethodInstance for foo(::Float64)
  from foo(x::Float64) @ Main REPL[14]:1
Arguments
  #self#::Core.Const(Main.foo)
  x::Float64
Body::Float64
1 ─ %1 = Main.:+::Core.Const(+)
│   %2 = Main.ifelse::Core.Const(ifelse)
│   %3 = Main.rand::Core.Const(rand)
│   %4 = Core.tuple(true, false)::Core.Const((true, false))
│   %5 = (%3)(%4)::Bool
│   %6 = Main.nothing::Core.Const(nothing)
│   %7 = (%2)(%5, 1.0, %6)::Union{Nothing, Float64}
│   %8 = (%1)(x, %7)::Float64
└──      return %8

It is however possible for reflection to sound alarms about statically known errors:

julia> using JET

julia> @report_opt foo(1.2) # spots statically unknown types
No errors detected


julia> @report_call foo(1.2) # looks for more errors from statically known types
═════ 1 possible error found ═════
┌ foo(x::Float64) @ Main ./REPL[14]:1
│ no matching method found `+(::Float64, ::Nothing)` (1/2 union split): (x::Float64 + ifelse(rand(tuple(true, false)::Tuple{Bool, Bool})::Bool, 1.0, nothing)::Union{Nothing, Float64})
└────────────────────

You might be expecting something more obvious, like the ahead-of-time compiler errors of statically typed languages. Dynamically typed languages with REPLs don’t naturally do that: types don’t need to be known at compile time, and type errors are allowed to be thrown at runtime like other errors. Although it is technically possible, the runtime won’t rerun type inference at every method just to highlight parts of lines when an error is thrown.
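The call-site option mentioned earlier can be sketched with a small macro. This is only an illustration: `@expect`, `forgot_return`, and `compute` are hypothetical names, not part of Base or any package, and the macro assumes you already suspect which subexpression might have the wrong type.

```julia
# Hypothetical helper (not part of Base): check a value's type at the call
# site and name the offending expression if the check fails.
macro expect(ex, T)
    label = string(ex)  # source text of the expression, captured at macro expansion
    quote
        local val = $(esc(ex))
        local expected = $(esc(T))
        val isa expected || throw(ArgumentError(
            string("`", $label, "` has type ", typeof(val), ", expected ", expected)))
        val
    end
end

# Stand-in for a function that accidentally returns nothing.
forgot_return(x) = (x * 2; nothing)

function compute(x::Float64)
    y = @expect forgot_return(x) Float64
    return x + y
end
```

Calling `compute(1.0)` should then throw an ArgumentError whose message contains the text `forgot_return(x)`. A plain type assertion such as `forgot_return(x)::Float64` also fails at the call site, but its TypeError reports only the expected and actual types, not the expression.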

Try JETLS.jl. It will warn you if variables can be nothing. If you start using it, it will probably give way too many warnings, but if you fix them one by one (or suppress invalid warnings), you get a much better experience in the end. AI can help to fix JETLS warnings, too.

I was actually asking about detailed runtime error info. Basically, say I have an error like this:

julia> a = zeros(Int64, 5)
5-element Vector{Int64}:
 0
 0
 0
 0
 0

julia> b = "silly string"
"silly string"

julia> for i in eachindex(a)
           a[i] = b
       end
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64
The function `convert` exists, but no method is defined for this combination of argument types.

Why can’t the MethodError say something like:
object 'b' has type String, which cannot be converted to type Int64 in the assignment statement at REPL[3]:2.

That carries the same info (no method exists to convert a string to an integer) but would also specifically name the variable that’s causing the issue. Likewise, given the following:

julia> a = zeros(Int64, 5);

julia> b = ones(Int64, 50);

julia> for i in 1:50
           a[i] = b[i]
       end
ERROR: BoundsError: attempt to access 5-element Vector{Int64} at index [6]

Why can’t the BoundsError say attempt to access 5-element Vector{Int64} 'a' at index [6]? Then I would immediately know that the array access issue was in a, not b.

Perhaps that’s what you were referring to in your reply, but I guess I don’t understand what you mean if that’s the case. Surely if Julia a) knows the objects associated with given variable names in the source code, b) knows the type of those objects at runtime, and c) can throw errors due to invalid operations on those objects, then it should be trivial to print the name(s) of the object(s) that caused the error, right?

I apologize, as I’m genuinely not trying to be rude or dense, but I really don’t understand why this is difficult to accomplish.

It doesn’t. The call f(a, b) does not inform the method of f that the input objects were assigned to a and b in the call scope, so whatever error f throws cannot name those variables. In many cases there wouldn’t be variables to name, like f(arr[42], 0). If the error is thrown by a nested function call, then the subexpressions would have to be traced across calls.
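A rough way to see this from the REPL, assuming that `a[1] = b` is syntax for `setindex!(a, b, 1)`: the MethodError that comes back records the function and the argument values, and it looks the same whether the value arrived through a variable or a literal. The names `err_named` and `err_literal` are only for this sketch.

```julia
a = zeros(Int64, 5)
b = "silly string"

# `a[1] = b` is syntax for `setindex!(a, b, 1)`; only the values are passed along.
err_named = try
    setindex!(a, b, 1)
catch e
    e
end

# Same call with a literal instead of a variable.
err_literal = try
    setindex!(a, "silly string", 1)
catch e
    e
end

println(err_named.f)      # the function that had no matching method
println(err_named.args)   # the argument values it was called with
println(err_named.args == err_literal.args)  # compares the recorded arguments
```

Nothing in the error object refers to the name `b`; by the time `convert` is reached, only the String value is left.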

We’d know the subexpressions after some investigation, and hypothetically the program could know them too if it performed type inference or other static analyses across the stack trace when errors are thrown. But that kind of overhead is undesirable at runtime, especially since errors can be thrown and caught routinely in try-catch. It’s better to reserve static analysis for the statically knowable errors, which also avoids having to execute a possibly long program (though unit testing is still recommended to catch things static analysis can’t). I don’t know of anything that traces variable names like you want, though.
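One partial workaround for the BoundsError case: the error object carries the array itself (field `a`) and the index (field `i`), so code that catches it can compare identities against the arrays it knows about. This is a sketch only; `copy_all!` and `culprit` are names made up for the example.

```julia
a = zeros(Int64, 5)
b = ones(Int64, 50)

# Wrap the loop in a function so it behaves the same in a script and in the REPL.
function copy_all!(dest, src)
    for i in eachindex(src)
        dest[i] = src[i]
    end
    return dest
end

err = try
    copy_all!(a, b)
catch e
    e
end

# The error holds the object that was indexed, so an identity check recovers
# which of our arrays it was, even though the error has no variable name.
culprit = err.a === a ? "a" : err.a === b ? "b" : "unknown"
println("out-of-bounds access on ", culprit, " at index ", err.i)
```

This only works at a call site that still has the candidate arrays in scope, which is the same limitation described above: the throwing method has the object, not the name.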

Thanks for explaining in more detail—I really do appreciate it. If a program halts when an error is thrown, however, why does the overhead of performing type inference across the stack matter? I get that this would be a problem if it were occurring at every function call in advance of an error being thrown, but if it happened after the error (like an implicit finally), then presumably the cost would be comparatively low.

Ideally we should try to show the offending line start + column start + column end + line end, not just line start.

And then display the offending lines, + 4 lines context, with the offending substring highlighted.
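As a rough sketch of what that display could look like, assuming the line and column range were available: `highlight` below is a hypothetical helper, not an existing Julia API, and it assumes ASCII source text so that column numbers match character positions.

```julia
# Hypothetical helper: show `context` lines around `line` and mark `cols`
# on the offending line with carets.
function highlight(src::AbstractString, line::Int, cols::UnitRange{Int}; context::Int=4)
    lines = split(src, '\n')
    out = String[]
    for n in max(1, line - context):min(length(lines), line + context)
        push!(out, string(lpad(n, 4), " | ", lines[n]))
        if n == line
            # 7 characters of gutter ("   n | "), then pad up to the first column
            push!(out, string(" "^7, " "^(first(cols) - 1), "^"^length(cols)))
        end
    end
    return join(out, '\n')
end

src = "x = 1\ny = x + nothing\nz = 2"
println(highlight(src, 2, 9:15; context=1))
```

The display part is the easy half; the hard part, as noted next, is getting column information into the debug info in the first place.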

I fear doing that would be quite a bit of work, because afaiu a lot of our debug info is on the “filename + linenumber only” level.
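A quick way to check what a stack frame records, as far as I can tell: catch an error, turn the backtrace into frames, and list the fields of a frame. `fails` is a throwaway function for this sketch.

```julia
function fails()
    return 1 + nothing   # no method +(::Int, ::Nothing)
end

frames = try
    fails()
catch
    stacktrace(catch_backtrace())
end

top = first(frames)
println(top.func, " at ", top.file, ":", top.line)

# Lists what a frame stores: function, file and line, but no column field.
println(fieldnames(typeof(top)))
```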

That is a problem that Julia shares with a lot of other languages. For example, Java tooling is abysmal in that respect (oh, NullPointerException on line xyz. Yeah, which of these? Am I supposed to reformat my code because you can’t include column numbers? In an age where I can’t buy 4:3 monitors any longer?)

I don’t think we in Julia have a real excuse for that issue. Java at least has the excuse of “in the 90s, paying for column numbers in debug info was expensive”.