BoundsError in CartesianIndices on 32-bit Windows

Expine · April 10, 2024, 2:31pm

Hi!

I experienced Integer overflow on Windows due to the fact that Int32 is the default one there:

length(CartesianIndices((1000000, 1000000)))

Succeeds on linux64 but fails on Windows (something like A[-72232] is out of bounds).

I copied randsubseq! implementation, where the overflow happens, and adjusted it so index vars are Int64 now:

randsubseq(r::AbstractRNG, A::AbstractArray{T}, p::Real) where {T} =
    randsubseq!(r, T[], A, p)

# Fill S (resized as needed) with a random subsequence of A, where
# each element of A is included in S with independent probability p.
function randsubseq!(r::AbstractRNG, S::AbstractArray, A::AbstractArray, p::Real)
    Base.require_one_based_indexing(S, A)
    0 <= p <= 1 || throw(ArgumentError("probability $p not in [0,1]"))
    n = prod(size(A), init=Int64(1))  # length - there the overflow happened originally
    p == 1 && return copyto!(resize!(S, n), A)
    empty!(S)
    p == 0 && return S
    nexpected = p * n
    sizehint!(S, round(Int64, nexpected + 5*sqrt(nexpected)))
    if p > 0.15
        for i = 1:n
            rand(r) <= p && push!(S, A[i])
        end
    else
        L = -1 / log1p(-p)
        i = Int64(0)
        while true
            s = randexp(r) * L
            s >= n - i && return S
            push!(S, A[i += ceil(Int64, s)])
        end
    end
    return S
end

Surprisingly I get:

  BoundsError: attempt to access 1000000×1000000 CartesianIndices{2, Tuple{Base.OneTo{Int32}, Base.OneTo{Int32}}} at index [10504]

on Windows, which doesn’t really make sense. How come I can’t access something that is in bounds?

mbauman · April 10, 2024, 2:48pm

Welcome! So, the reason that CartesianIndices exists is to be used as an index — and on a 32-bit system, you cannot create a thing that has more than the ~2 billion addressable elements. A 1000000x1000000 array has a trillion elements.

What are you trying to do here? I suspect there’s a much better solution than a trillion iterations in the first place.

Expine · April 10, 2024, 3:15pm

So I work with a sparse tensor library Finch.jl.

There’s a method there fsprand, fiber-sparse-rand that returns a COO tensor, given dimensions and density p:

Internally:

fsprand(1_000_000, 1_000_000, 1e-06)

uses CartesianIndices to describe only non-zeros, and calls:

randsubseq(r, CartesianIndices(M), p)

where M=(1_000_000, 1_000_000) and p=1e-06, which fails on Windows.

We have an open PR with some discussion: Wma/fix 488 by willow-ahrens · Pull Request #489 · willow-ahrens/Finch.jl · GitHub

mbauman · April 10, 2024, 3:31pm

Ah, of course, the non-sensical bounds error is because, when you use linear indexing, it’ll check the index against the length. And the length overflows to negative, so everything is going to be OOB.

The crux of it is that arrays (even sparse ones) with more than 2^31 elements are going to be troublesome on 32-bit systems.

Topic		Replies	Views
CuArrays.jl errors with indexing / CartesianIndices General Usage gpu , indexing , cuda	1	590	July 24, 2019
BoundsError: attempt to access 0-element Array{Any,1} at index [1] General Usage	3	3800	July 27, 2021
Type Inference for CartesianIndices(array) fail when the dimension of the array is too large Performance indexing	3	449	August 25, 2019
Bounds error while iterating over for loop General Usage	3	694	September 14, 2022
BoundsError: attempt to access 0-element Array{JuMP.Variable,1} at index [1] General Usage	0	767	December 25, 2019

BoundsError in CartesianIndices on 32-bit Windows

Related topics