Which getindex expression threw BoundsError

jar1 · January 27, 2024, 7:37am

I often make this kind of mistake. Would it be possible for the error to figure out which subexpression has the problem, to save me time debugging?

julia> let n = 2, m = 3
           xs = randn(n)
           ys = randn(m)
           xs[m] + ys[n]
       end
ERROR: BoundsError: attempt to access 2-element Vector{Float64} at index [3]
Stacktrace:
 [1] getindex(A::Vector{Float64}, i1::Int64)
   @ Base ./essentials.jl:13
 [2] top-level scope

Sukera · January 27, 2024, 7:49am

Unfortunately, lowering doesn’t give enough information here. Only the source line is preserved, not the textual column or subexpression.

Having that kind of information would be very beneficial, not just for better identifying the source of errors. Some examples can be found here:

github.com/JuliaLang/julia

[FR] A wishlist for code coverage features

opened 02:57PM - 08 Aug 23 UTC

Seelengrab

design feature

# Code Coverage Code coverage is a very interesting field, with lots of diffe…rent variations. The following is loosely inspired by some literature I stumbled across years ago - I'll have to see if I can find it again. The ideas are definitely not my own, though I believe the julia specific application is. **Note:** This issue was originally a comment in [this](https://github.com/JuliaLang/julia/issues/49978#issuecomment-1567922456) issue, but seeing as this is a more general wishlist/thing that would be nice to have, I thought it would be a good idea to track this seperately. This issue is not exactly the same as the linked comment though, so please do read through this thoroughly! If you think this is more appropriate as a discussion instead of an issue, please do feel free to convert this issue into one. ## Motivation/Background The larger motivation for this is in part that code coverage in Julia has historically had quite a lot of issues, ranging from tracking wrong source lines, over reporting wrong coverage even though a line must have been passed through to changing effects of called functions and inhibiting optimizations. This seems unnecessary, and especially problematic when looking at other languages and how well these kinds of software engineering tools help inform & establish good development practices. Generally, each of the following list has either full, partial, or no coverage. Full coverage means "every variation of this is covered by some test code", partial coverage means "some variation of this is covered by some test code" and no coverage means "no variation of this is covered by some test code". * Line coverage * Is this line of source code covered by some test/actual execution? * Expression coverage * Is this expression (and its subexpressions) covered by some test/actual execution? * Branch coverage * Is this branch covered by some test/actual execution? * Path coverage * Is a given path through the program, from call to return, covered by some test/execution? * This is notably different from branch coverage - there are situation where you can test each individual branch in isolation and reach full branch coverage, but don't have full path coverage, which can hide problems. For example, think of a function with two `if`/`else` blocks following one another, without being nested. You can reach full branch coverage with just 2 inputs, but path coverage requires at least 4 in the general case, hitting every combination of `if` and `else` block. * This is usually sufficient and can be reached by segmenting the input/output space into equivalence classes for the purpose of testing. * Input/Output space coverage * Is a given subregion of the full input argument space covered by some test/execution? * Reaching full input/output space coverage shouldn't be the goal of any serious software project; that's just wasting CPU cycles (you can do it on some small functions taking a small number of arguments, like a single `UInt32` for example) * This is mostly achievable for small types and types with fixed number of known values, like enums. The list is ordered from "least trustworthy" to "most trustworthy", in terms of implications about correctness of the tested piece of code. If you have full coverage (I think sometimes also called "total" in the literature, but don't quote me on that) on one level of the list, it means that you must also have full coverage on the levels above. Partial coverage only implies partial coverage up the list (though full coverage is allowed - for example, it's extremely common to only have partial Input/Output space coverage, yet you reach full line coverage trivially). No coverage on any level means that you can't have coverage on any other level here either. Julia currently only reports (fuzzy) line coverage, which is extremely easy to reach with a bit of effort (for example, the package FieldFlags.jl has [100% coverage](https://app.codecov.io/github/Seelengrab/FieldFlags.jl), yet I only consider it moderately well tested because Julia doesn't report functions that aren't called at all as uncovered). So from my POV, "making coverage better" means being able to emit more granular information about what _isn't_ covered, as well as actually reporting uncalled functions as uncovered, thereby reducing reported coverage percentages (because we're moving down the list of coverage guarantees). To get some of these levels, our source code needs to at least track columns as well as lines, otherwise we only get a subset of expressions, branches or paths that could be covered (think ternary expressions, for example - line coverage decidedly does not imply expression or branch coverage). The way coverage (on master) works (or at last, as far as I can tell) is by tainting the `:sideeffectfree` effect of any called code, because when tracking coverage, optimizing code by deleting dead branches has the "sideeffect" of changing the observed coverage number. I don't think that should happen; any code that's not actively looking at coverage data must return the same values it did before (assuming the code is otherwise free of sideeffects and RNG and so on). Further, coverage is a diagnostic global property of a program, not a property inherent to a function; tracking it through tainting of `:sideeffectfree` (or even a different effect) clashes with the meaning of an effect. Here's an example: ```julia h(x) = iseven(x) ? "string" : 1 g() = h(2) ``` A priori, asking whether any part of this code is "covered" doesn't really make sense - there is no call to either `g` or `h` at the top level, and nothing is executed. So nothing can be covered because nothing runs that could produce coverage data. There is an argument to be made that the compiler is free to evaluate `h(2)` at compile time and thus the question is, should it produce coverage data for that call? My answer to that is "yes, but it doesn't necessarily need to report that as true coverage of either `g` or `h`, because no call causing the `h(2)` call actually occured in the (nonexistent) testsuite". The thinking is this: since there is no top level call to `g()`, whether or not `h(2)` is covered is unknown - the compiler can already compute the coverage data during constant folding (it found a valid path to do that in the first place after all), but it must not report it to the user as "this branch is covered", because that would leak the implementation detail/optimization of constant folding. One solution is to of course disable constant folding when emitting coverage, but that then implies the issue about whether or not constant folding is part of observable API at some point, and how to test for that if it's actually required for some other reason (like compiling away an abstraction to build a nice API, which julia is _extremely_ good at - it would be a shame not to be able to test for that in the future!). On top of this, disabling compiler optimizations in the presence of coverage tracking means that CI runs (which by default do enable coverage) must run for much longer (or twice, once with coverage and once without), simply because the code may have been engineered with performance & optimizations in mind. Of course, code that *is* actively looking at coverage data should have its `:consistent` effect tainted; its behavior obviously must change as soon as the coverage data of the inspected function changes. ## Potential solutions Another solution is to keep track of coverage data (in however granular form we want to [1]), but only "merge" it with the parent call when actually requested/a top level call is done that inquires about the coverage of its subexpressions. An additional benefit of this approach is that we can ask the opposite question - where do I need to write more tests to increase coverage, by making top level calls that would cover more subexpressions that are not yet covered in some capacity by another call? I think this can actually already work with abstract interpretation by keeping track of covered line data through calls & expressions, bubbling coverage data upwards. A function is fully covered if all its lines/subexpressions are covered - if some are covered it's partially covered and if none it's not covered (you get the idea). Admittedly I don't have an implementation of that idea. The difficulty of course is that there's quite a lot of additional metadata one would like to get access to when asking about coverage - it's not impossible to think that Julia could directly produce a "coverage percentage" (how bad of a metric that is) itself, using that method. Nonetheless, these potential solutions are only conceptual at the moment, and may not be appropriate/a good way to implement a good coverage mechanism. ## The Wishlist So, in summary, these are the kinds of features I'd love to have with better coverage: * `Expr` based coverage reporting * Coverage reporting not influencing optimizations like dead code elimination, inlining, constant folding, semi-concrete evaluation etc. * Not tainting effects that are unrelated to code coverage * Exposing an API for querying coverage data dynamically * A clearer distinction between `Not covered`/`partially covered`/`uncovered`, as we currently only have a binary designator for that * Coverage reporting of functions that are not called in the codebase * Code coverage of code generated by macros * This is likely very tricky, because a macro could produce entirely different code at every invocation. There would have to be some kind of double mapping/tracking of where an expression comes from, beyond just LineNumberNodes. And these are the kinds of things these features would enable: * Coverage guided fuzzing, e.g. with PropCheck.jl * PropCheck.jl currently just generates 100 test cases and then stops checking for new ones. Instead, there could be a mode that generates test cases until the function being fuzzed is sufficiently covered, providing much better fuzzing capabilities than a "blind shot" currently does. * Coverage reporting for guiding which functions need more tests * Similarly to what PropCheck.jl could do, but for reporting which branches or even value regions are currently uncovered by tests. See e.g. https://ieeexplore.ieee.org/stampPDF/getPDF.jsp?tp=&arnumber=9152719 for a version of coverage guided fuzzers * Pretty visualizations! * [Better error reporting](https://discourse.julialang.org/t/which-getindex-expression-threw-boundserror/109329) * With dynamic coverage querying, it ought to be possible to have the following flow (or something like it, depending on how granular the data we get is): ```julia julia> f(x::Int) = iseven(x) ? "foo" : "bar" julia> coverage(f, (Int,), :line) :uncovered # and similar for all other variants of coverage julia> f(2) "foo" julia> coverage(f, (Int,), :line) :full julia> coverage(f, (Int,), :expr) :partial julia> f(1) "bar" julia> coverage(f, (Int,), :expr) :full julia> coverage(f, (Int,), :value) :partial # because we can't claim full coverage if we only test two values out of all `Int` ``` #### Others There may be others that arise in a discussion in the comments; I will add them to the list above. ### Potential problems #### `rand` Part of this issue came out of a long discussion I've had with @oscardssmith on Slack, in particular in regards to examples where tracking coverage is tricky due to cases involving some RNG. Here are some points raised, and how I think they fit into this wishlis: ``` function f() x = y = 1 if rand() < .5 x = 5 elseif x == 1 y = 2 end return x^y end ``` In any given invocation of `f`, either branch could be taken, so any given invocation can only ever report partial coverage. However, that has no bearing on whether e.g. `x^y` is covered; it is always fully (`:line` and `:expr`) covered as soon as any call is made. Further, this function by itself already cannot be constant folded, because `rand()` is not `:consistent`; tracking coverage on top of that shouldn't change that. Nevertheless, I don't think this actually stands in the way of seperating coverage tracking from `:effect_free` tracking, because tracking coverage of `f` has no influence whatsoever on the behavior of `f`. #### Effects of inspecting coverage Inspecting coverage data about a function must certainly taint `:consistent` of the function/code/expression doing the inspecting, at least because the result of coverage can & should change dynamically (as in the REPL example above). However, this can be tricky for something like this: ```julia # s being from `:line` etc. function foo(s::Symbol) coverage(foo, (Symbol,), s) end ``` What should the effects of this function be? Evidently, as soon as you call `foo`, you should always get `:full` coverage back. So it should be `:consistent`. However, before you make that call, the function has no coverage yet, so it cannot return `:full` and thus isn't `:consistent`. Arguably, such edge cases of self-inspection ought to just be disallowed entirely. It may be difficult to achieve that in practice, but since this is a diagnostic query, I think it's fair game to have it be slow. #### Others There may be others that arise in a discussion in the comments; I will add them to the list above. --- [1]: e.g. for constant foldable expressions, Julia must already have perfect knowledge of line, expression, branch, path and input/output space coverage, because it was able to execute the code at compile time! This doesn't necessarily imply that the result of that is full coverage, but does imply partial coverage for all of these.

EDIT: I’ve added this discourse thread as an “this would be enabled by better coverage reporting” entry to the list of cool things.

mnemnion · January 28, 2024, 4:15pm

I’ve run into this kind of problem across many languages. It’s uncommon for stack traces to include column information, so if there’s a complex expression, it can be challenging to figure out which part of it is throwing an error.

The workaround while debugging is to put each part of the expression on its own line, something like this:

let n = 2, m = 3
           xs = randn(n)
           ys = randn(m)
           (xs[m] + 
            ys[n])
       end
end

Obviously you wouldn’t want to leave a program in this state! But this will give xs[m] and ys[n] lines of their own, so the stacktrace of the BoundsError will tell you which one is out-of-bounds.

Benny · January 28, 2024, 5:24pm

A justification for only showing line number is that you often shouldn’t only scrutinize the call that threw the error, whether it’s a meaningful subexpression, the whole line, an important series of lines, a whole scope, or possibly more. As you can tell from the example, xs[m] threw the error because length(xs) == n == 2 but m == 3, so you can get around the error with xs[n]. But xs[n] + ys[n] is unlikely to be the right fix because most of us would notice different indices as a typo sooner if the same indices were intended. The more likely typo is swapping indices, which is fixed by xs[n] + ys[m]. Whichever is the right fix, it’s important to look around the error. Still, you can do that anyways if a column number was reported indicating the erroring call, and that’s not much more to print.

nsajko · January 28, 2024, 6:27pm

Yes, but

This isn’t enough. Assigning xs[m] and ys[n] to variables is. The problem is that, after some compilation stage, Julia doesn’t know that the expression has more than one line. I guess that lowering is to blame here, according to Sukera?

I wonder how relevant this message still is (since then we got JuliaSyntax.jl and LLVM could have gotten some relevant improvements, too?):

Somewhat related issues on Github:

github.com/JuliaLang/julia

Better parsing errors: indicate position with a caret

opened 10:44AM - 01 Nov 19 UTC

timholy

parser error messages

I have some new or newish programmers and am trying to pay attention to their ro…adblocks in learning Julia, so I may occasionally file issues which don't look like my usual fare. Suppose you try to define a variable using a name that starts with a number: ```julia julia> 2x = 22 ERROR: syntax: "2" is not a valid function argument name ``` A new programmer might wonder, why does this talk about a "function argument name"? I haven't called any functions (or so I thought). Python is not great at this, but the differences are interesting: ```python >>> 2x = 22 File "<stdin>", line 1 2x = 22 ^ SyntaxError: invalid syntax >>> ``` "invalid syntax" isn't very helpful, but python partly makes up for this with the caret syntax showing the specific place in the line that causes the trouble. (In this case, disambiguating which 2 is the problematic one.) C (gcc) is much better, though still lacks the key hint that variable names should not start with a number: ```c #include <stdio.h> int main(void) { int 2x = 22; return 0; } ``` ```sh $ gcc c.c c.c: In function ‘main’: c.c:4:9: error: invalid suffix "x" on integer constant int 2x = 22; ^~ c.c:4:9: error: expected identifier or ‘(’ before numeric constant ``` Matlab isn't perfect either, but it goes out of its way to be helpful: ```matlab >> 2x = 22 2x = 22 ↑ Error: Invalid expression. Check for missing multiplication operator, missing or unbalanced delimiters, or other syntax error. To construct matrices, use brackets instead of parentheses. Did you mean: >> x = 22 ``` (To clarify, that `x = 22` is queued up on the REPL.) To me, the biggest takeaway is that most other languages have decided that a caret mark is a crucial component of the parser's error-reporting.

github.com/JuliaLang/julia

Type error within struct reported without actual error line

opened 10:33PM - 22 Nov 20 UTC

paulmelis

error handling error messages

Reduction from a larger struct in which I was specifying the type of the `nb` fi…eld incorrectly: ``` melis@juggle 23:28:~$ cat t.jl struct S field1::Bool field2::String # Error is here nb::Matrix{UInt8, UInt8} end melis@juggle 23:28:~$ julia t.jl ERROR: LoadError: too many parameters for type Stacktrace: [1] top-level scope at /home/melis/t.jl:1 in expression starting at /home/melis/t.jl:1 ``` The error line reported is the start of the struct definition, which is only of limited value. As I'm a julia beginner I don't immediately spot these errors within the struct and need to check for the offending line by commenting several fields and rerunning. Would it be possible to report the actual line involved? This is especially useful when structs start to contain many fields. ``` julia> versioninfo() Julia Version 1.5.2 Commit 539f3ce943* (2020-09-23 23:17 UTC) Platform Info: OS: Linux (x86_64-pc-linux-gnu) CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-10.0.1 (ORCJIT, haswell) ```

New issue:

github.com/JuliaLang/julia

finer-grained location info for error messages, strack traces: line number, column number

opened 07:19PM - 28 Jan 24 UTC

nsajko

error handling parser lowering error messages

`a.jl`: ```julia 1 + 2 + 3 + error() + 5 ``` `b.jl`: ```julia 1 …+ 2 + 3 + 4 + error() + 5 ``` REPL session: ```julia-repl julia> include("a.jl") # It'd be nice to get a more accurate line number here, i.e., "4" instead of "1" ERROR: LoadError: Stacktrace: [1] error() @ Base ./error.jl:44 [2] top-level scope @ /tmp/asdoi98uz/a.jl:1 [3] include(fname::String) @ Main ./sysimg.jl:38 [4] top-level scope @ REPL[1]:1 in expression starting at /tmp/asdoi98uz/a.jl:1 julia> include("b.jl") # It'd be nice to get a column number range here for the offending call, `error()` ERROR: LoadError: Stacktrace: [1] error() @ Base ./error.jl:44 [2] top-level scope @ /tmp/asdoi98uz/b.jl:1 [3] include(fname::String) @ Main ./sysimg.jl:38 [4] top-level scope @ REPL[2]:1 in expression starting at /tmp/asdoi98uz/b.jl:1 julia> versioninfo() Julia Version 1.11.0-DEV.1389 Commit ecc14ca3888 (2024-01-27 15:45 UTC) Build Info: Official https://julialang.org/ release Platform Info: OS: Linux (x86_64-linux-gnu) CPU: 8 × AMD Ryzen 3 5300U with Radeon Graphics WORD_SIZE: 64 LLVM: libLLVM-16.0.6 (ORCJIT, znver2) Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores) ``` Related issues: #33735, #38531

Topic		Replies	Views
Make Julia’s Error Codes Even Better Than Elm’s Internals & Design	61	7441	September 30, 2023
Human readable error messages General Usage proposal	16	4444	October 8, 2022
Some way to debug syntax errors? General Usage debug , syntax , parser	1	494	August 25, 2022
Unexpected Error. How would I dig deeper? General Usage question	0	363	October 26, 2021
How to debug internal runtime error? General Usage inference , error , makie	4	987	December 23, 2021

Which getindex expression threw BoundsError

Related topics