Hi,

Is it possible to use `mul!` with a CUDA sparse array and CuVectors using DoubleFloats.jl?

The following code triggers a scalar indexing error:

```
using CUDA, SparseArrays, LinearAlgebra, DoubleFloats
CUDA.allowscalar(false)
T = Double64
a = CUDA.zeros(T, 10)
b = CUDA.ones(T, 10)
A = sprand(T, 10, 10, 0.2)
Agpu = CUDA.CUSPARSE.CuSparseMatrixCSC(A)
mul!(a, Agpu, b) # errors: falls back to scalar indexing
```

However, using a dense array works fine:

```
A = CUDA.ones(T, 10, 10)
mul!(a, A, b)
```