Scalar indexing is disallowed

Can you tell me why I am getting it?

ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.

in

function scale!(A::SparseArrays.SparseMatrixCSC{Float64,Int}, b::Vector{Float64})
    if CUDA.functional()
        A_gpu = CuSparseMatrixCSC(A)
        row_norms_gpu = sqrt.(sum(abs2, A_gpu, dims=2))
        scaling_factors_gpu = CUDA.ones(size(row_norms_gpu)) ./ row_norms_gpu
        scaling_factors = Array(scaling_factors_gpu) 
    else
        scaling_factors = 1.0 ./ sqrt.(sum(abs2, A, dims=2))
    end
    A .*= scaling_factors
    b .*= scaling_factors 
end

Thanks

I cannot install CuArrays, and cannot pinpoint the error. But can’t you start reducing the code until you find the problematic line.

First, remove the else clause. Then just write a function that does

A_gpu = CuSparseMatrixCSC(A)
row_norms_gpu = sqrt.(sum(abs2, A_gpu, dims=2))

etc.

BTW, this

scaling_factors_gpu = CUDA.ones(size(row_norms_gpu)) ./ row_norms_gpu

creates an unnecessary array of ones, just to use it for division. Instead just use the generic function inv, like this:

scaling_factors_gpu = inv.(row_norms_gpu)
1 Like

Look at the backtrace to find out which operation does not work properly, and try to avoid it. See Workflow · CUDA.jl for details on scalar indexing.

The error seems to be from the line

scaling_factors_gpu = CUDA.ones(size(row_norms_gpu)) ./ row_norms_gpu

where row_norms_gpu is a normal (non-gpu) array creating some bad interactions. So either changing CUDA.ones(...) to a 1 or wrapping the row_norms_gpu in a CuArray seems to remove the error, but the first case does the calculation on cpu and the second does a dense sum, so neither might be very good if you operate on a very large matrix.

As to why row_norms_gpu is not already a CuArray it seems like the sum for CuSparseMatrixCSC results in a normal Array, while sum(CuArray(A_gpu), dims=2) results in a CuArray. Not sure what I would expect here (CuSparse or CuArray), but probably something that was still on the GPU at least.

No matter how the CuSparse issue is solved, I believe one should use inv.() instead of 1 ./ ().