Probabilistic bounds-checking idea, to retain more (probabilistic) safety, and retaining most of the speed

First, I like Julia’s approach to bounds checking, it seems ideal, better than most, in fact all, languages I know of.

My own idea: @inbounds could be relaxed to check say every 4th loop iteration (configurable by the user?), to lessen overhead. Your code is still less safe, but you retain at least some safety. Seemingly the code would expand, but you may have expansion anyway, it’s common for compilers to unroll loops.

Background: Julia is mostly not made to be a safe language, while it is by default regarding bounds checks (but not e.g. overflows). And then you can to off locally with @inbounds by the programmer (or globally, by the user, or force always on despite @inbounds).

If you do turn of bounds checking (globally, or even in just one place) locally in your program, the program in no longer safe, i.e. it depends on the analysis of the programmer being right (and possibly inputs to the program).

Then the best you can hope for is a crash, rather than incorrect calculations.

I got the idea, while reading (and answering at) this thread on Rust (a language made to be safe above all, safer, really than Java):

See also:

we propose CHOP, a Convex Hull Optimization based framework, for bypassing redundant memory bounds checking via profile-guided inferences.

I think we can actually do one step better. Bounds checks are mostly free unless they prevent vectorization, so simply allowing the compiler to switch the order of bounds errors would allow for most of the required speedup. (Also doing things like checking every 4th element strictly would mean that you have a roughly 1 in 4 chance of missing common errors like off by 1)

4 Likes

Yeah. One of my plans for the new LoopVectorization would be to have the semantics be that if code throws an error, it is allowed to move that error earlier in the program.
Thus, it would be allowed to hoist bounds checks out of and in front of loops, checking them (and possibly throwing an error) before entering the loop.

This would only work in cases where bounds are inferable, e.g. affine loop nests that can be represented as a polyhedra.
It would work less well for loops like

for i in 1:N
    for j in 1:innerlooplengths[i]
        A[j,i]
    end
end

but you could still hoist the checks out of the j loop.

Things like LUTs would be harder to infer the range for, plus even if you do infer the range, it doesn’t mean it’ll actually index the full range, so you could throw a BoundsError() incorrectly, meaning I don’t see a way to do this correctly for LUT-like indexing at the moment.

1 Like