LoopVectorization.vsum : invalid index: 4 x VectorizationBase.Vec

What’s wrong here?

using LoopVectorization
let n = 10000, k = 3, A = rand(n, k)
    sum(1:n) do i
        A[i,1]
    end # ok
    vsum(1:n) do i
        A[i,1]
    end
    #=
    ERROR: ArgumentError: invalid index: 4 x VectorizationBase.Vec{4, Int64}
    VectorizationBase.Vec{4, Int64}<1, 2, 3, 4>
    VectorizationBase.Vec{4, Int64}<5, 6, 7, 8>
    VectorizationBase.Vec{4, Int64}<9, 10, 11, 12>
    VectorizationBase.Vec{4, Int64}<13, 14, 15, 16> of type VectorizationBase.VecUnroll{3, 4, Int64, VectorizationBase.Vec{4, Int64}}
    =#
end

Julia 1.10.3
LoopVectorization v0.12.171

This doesn't answer your question, but here is a possibly related oddity: vsum(i -> A[i], 1:n), which should be equivalent (in a column-major array, the linear indices 1:n cover exactly the first column), does run but returns a (slightly) wrong result.

julia> s = sum(@view(A[:, 1]));  # assumed to be correct

julia> vsum(@view(A[:, 1])) ≈ s
true

julia> vsum(@view(A[1:n])) ≈ s
true

julia> vsum(i->A[i], 1:n) ≈ s
false

julia> vsum(i->A[i, 1], 1:n)
ERROR: ArgumentError: invalid index: 4 x VectorizationBase.Vec{4, Int64}
...
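
Since the calls that take the data directly (the @view variants above) agree with sum, one possible workaround is to avoid passing an indexing closure to vsum and write the reduction as an explicit @turbo loop instead. The following is only a minimal sketch: the function name colsum_turbo is hypothetical, and it has not been checked against the exact versions listed in the question.

```julia
using LoopVectorization

# Hypothetical helper (not part of LoopVectorization): sums column j of A
# with an explicit @turbo reduction loop instead of vsum(f, 1:n).
function colsum_turbo(A::AbstractMatrix{T}, j::Integer) where {T}
    s = zero(T)
    # @turbo vectorizes the loop over the row index i; j is loop-invariant.
    @turbo for i in axes(A, 1)
        s += A[i, j]
    end
    return s
end

A = rand(10_000, 3)
colsum_turbo(A, 1) ≈ sum(@view A[:, 1])  # compare against Base.sum
```

Like vsum, the @turbo loop may accumulate in a different order than Base.sum, so the comparison uses ≈ rather than ==.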