Hi there,

I’m rather new to Julia (and already enjoying it a lot), so please don’t be too hard on me if my topic doesn’t make much sense. However, I think I noticed a somewhat surprising behaviour when `cumsum` (cumulative sum) is applied to a `SparseVector`, which also leads to a large performance issue in certain cases:

```
using SparseArrays
prob_vec = sprand(10, 0.2)
```

```
10-element SparseVector{Float64,Int64} with 2 stored entries:
  [2 ]  =  0.459558
  [10]  =  0.50192
```

`res = cumsum(prob_vec)`

```
10-element SparseVector{Float64,Int64} with 9 stored entries:
  [2 ]  =  0.459558
  [3 ]  =  0.459558
  [4 ]  =  0.459558
  ⋮
  [7 ]  =  0.459558
  [8 ]  =  0.459558
  [9 ]  =  0.459558
  [10]  =  0.961478
```

So once `cumsum` finds the first non-zero entry, all subsequent indices are no longer sparse. I understand that this has to happen mathematically, in case you want to check in-between indices like 7. But it might be misleading for some users to call `res` a `SparseVector`, letting them believe `res` is still sparse and fast in subsequent operations.
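To make the densification concrete, here is a minimal sketch (with fixed values rather than `sprand`, so the result is reproducible) showing that the number of stored entries after `cumsum` grows to cover everything from the first non-zero onward:

```julia
using SparseArrays

v = sparsevec([2, 10], [0.5, 0.5], 10)  # 10-element vector, 2 stored entries
res = cumsum(v)

nnz(v)    # -> 2
nnz(res)  # -> 9: indices 2 through 10 are all stored now
```

So for a long vector whose first non-zero entry comes early, `res` stores nearly `length(v)` values despite still having the `SparseVector` type.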

In my specific case `prob_vec` is a probability vector (`sum(prob_vec) == 1`) and I was interested in

`sel_ind = findfirst(cumsum(prob_vec) .> rand())`

which was extremely slow (roughly a factor of 1000 on a large `prob_vec`) compared to

```
prob_vec_inds, prob_vec_vals = findnz(prob_vec)
prob_cum = cumsum(prob_vec_vals)
sel_ind = prob_vec_inds[findfirst(prob_cum.>rand())]
```
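The workaround above can be wrapped into a small helper (the function name `sample_sparse` is my own, just for illustration); it only ever builds a dense cumulative sum of length `nnz(prob_vec)`, not `length(prob_vec)`:

```julia
using SparseArrays

# Sketch: pick an index from a sparse probability vector given a draw r in [0, 1),
# touching only the stored entries instead of a densified cumsum.
function sample_sparse(prob_vec::SparseVector, r::Real)
    inds, vals = findnz(prob_vec)   # stored indices and values only
    cum = cumsum(vals)              # dense, but of length nnz(prob_vec)
    k = findfirst(c -> c > r, cum)
    return inds[k]
end

prob_vec = sparsevec([2, 10], [0.4, 0.6], 10)
sample_sparse(prob_vec, 0.3)  # -> 2  (cumulative sum reaches 0.4 at index 2)
sample_sparse(prob_vec, 0.5)  # -> 10
```

In real use `r` would come from `rand()`; I pass it explicitly here so the example is deterministic.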

due to the behaviour of `cumsum` on large `SparseVector`s. Maybe there is a way to notify users about this behaviour?

Best