One of Julia’s main purposes is to enable high-performance computing at scale - it’s on the front page (Ecosystem → Parallel Computing at The Julia Programming Language)! However, apparently `--check-bounds=no` is becoming ‘defunct’ (Performance regression up to 20% when updating from Julia v1.10.4 to v1.11.0-rc1 · Issue #55009 · JuliaLang/julia · GitHub, Segfault when creating sysimage with `--check-bounds=no` under Julia-1.11 · Issue #1021 · JuliaLang/PackageCompiler.jl · GitHub).
Sorry this post is a bit long. In recent discussions of `--check-bounds=no`, I feel that people who use and want `--check-bounds=no` are being told that they’re simply wrong and shouldn’t use it, so I want to defend this particular use case in detail.
Issues:
- Many scientific HPC codes have a small number of developers (1-5), who are primarily domain-specialist scientists (often PhD students with a lot else to learn already), not programmers or computer scientists.
- HPC codes can routinely use millions or hundreds of millions of CPU hours per run, with a significant cost in both money and carbon emissions.
`--check-bounds=no` was a major selling point, as it made it easy to develop and test code in a safe way while removing the cost of bounds checking in large-scale runs. Even a 10% cost of bounds checking is significant, and it could be more like 50%.
A common use case in this scientific-HPC domain is to time-evolve some system of PDEs. In that case we repeat identical operations millions or hundreds of millions of times, just with different data in the arrays, while all the array indexing is identical. It is insane to pay $100,000s to bounds-check repeated operations that are identical apart from the numerical values. For this kind of operation, if we check the correctness of the indexing on a reasonably large number of small grids, we can be confident enough that it is still correct when the only difference is that we use more grid points and/or more timesteps. An ideal solution might be to bounds-check only the first few timesteps and then disable it, but that is probably too much trouble to implement, and we can do it manually by just doing a short run with bounds checking (admittedly probably only after we notice a problem).
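To make the ‘bounds-check only the first few timesteps’ idea concrete, here is a minimal sketch of what doing it by hand looks like, and why it is a nuisance: every kernel ends up written twice. The names (`rhs_checked!`, `rhs_unchecked!`, `evolve!`) are made up and the physics is a toy 1D diffusion stencil with forward-Euler stepping, purely for illustration.

```julia
# Sketch: run the same stencil with bounds checks for the first few timesteps,
# then switch to an identical kernel with the checks elided via @inbounds.
function rhs_checked!(dudt, u, dx)
    for i in 2:length(u)-1
        dudt[i] = (u[i+1] - 2u[i] + u[i-1]) / dx^2   # bounds checks active
    end
    return dudt
end

function rhs_unchecked!(dudt, u, dx)
    @inbounds for i in 2:length(u)-1
        dudt[i] = (u[i+1] - 2u[i] + u[i-1]) / dx^2   # bounds checks elided
    end
    return dudt
end

function evolve!(u, dx, dt, nsteps; ncheck=10)
    dudt = zero(u)                      # boundary entries stay zero
    for step in 1:nsteps
        rhs! = step <= ncheck ? rhs_checked! : rhs_unchecked!
        rhs!(dudt, u, dx)
        @. u += dt * dudt               # forward-Euler update
    end
    return u
end
```

The indexing pattern is identical on every step; only the floating-point data changes, which is exactly why re-checking it hundreds of millions of times buys nothing.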
The suggested solution seems to be ‘use `@inbounds` where it is important’. Say 50% of my code is for setup of the problem, and 50% calculates the time derivative. Then 50% of the code is essentially ‘in the hot loop’, and this is the part most likely to be under active development. To expect everyone to think carefully about where to put `@inbounds` every time they write new code is not reasonable for the kind of project teams we have.
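Part of what makes ‘just add `@inbounds`’ burdensome is that, as I understand the documented behaviour, `@inbounds` only elides checks in the code it directly wraps: indexing inside ordinary helper functions keeps its checks unless those helpers are explicitly marked with `Base.@propagate_inbounds`. A sketch (again with made-up names):

```julia
# The caller's @inbounds does not reach into an ordinary helper function:
stencil(u, i, dx) = (u[i+1] - 2u[i] + u[i-1]) / dx^2            # still bounds checked

# Only helpers that explicitly opt in inherit the caller's inbounds context:
Base.@propagate_inbounds stencil_prop(u, i, dx) = (u[i+1] - 2u[i] + u[i-1]) / dx^2

function rhs!(dudt, u, dx)
    @inbounds for i in 2:length(u)-1
        dudt[i] = stencil_prop(u, i, dx)   # checks elided; calling stencil(...) here would keep them
    end
    return dudt
end
```

So annotating ‘the hot spots’ also means auditing every little helper that the hot loop calls, every time the code changes.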
The argument against `--check-bounds=no` (and even against `@inbounds`), if I’ve understood it, is that const-propagation and similar optimizations that the compiler might be able to do are harder without bounds checking. If we accept that out-of-bounds access is undefined behaviour, which might cause subtly wrong results, segfaults (or even set fire to your computer!), why should there be a problem? The compiler has been told that it is allowed to assume there is no out-of-bounds access, and to optimize on that basis. This is standard practice in HPC. If that is too unsafe for many/most domains where Julia is used, fine, but could we not have something like `--check-bounds=unsafe` for HPC, where the user genuinely takes responsibility for ensuring that there are no out-of-bounds array accesses, on pain of undefined behaviour? I’m not a compiler developer, so I have no idea how much work this is, but isn’t it just a short-cut in some places, so that the compiler does not have to try to prove inbounds-ness, which seems (to my uninformed mind) fairly simple?
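For what it is worth, my rough understanding of what ‘prove inbounds-ness’ means mechanically, sketched with a made-up wrapper type: the checks live in `@boundscheck` blocks, which the compiler may drop when it can prove an access is in bounds, and which `@inbounds` (or, globally, `--check-bounds=no`) simply deletes.

```julia
# Rough sketch of where the checks actually live, using a trivial wrapper type.
struct MyVec{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::MyVec) = size(v.data)

@inline function Base.getindex(v::MyVec, i::Int)
    @boundscheck checkbounds(v, i)   # elidable check: removed by @inbounds / --check-bounds=no
    return @inbounds v.data[i]       # the access itself, assumed in bounds once past the check
end
```

A hypothetical `--check-bounds=unsafe` would just be a supported way to delete these blocks wholesale, with the user accepting the undefined-behaviour contract.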