Unification of `SparseMatrixCSC` and `SparseVector`

Are there plans (proposals, PRs, discussions) for unifying `SparseMatrixCSC` and `SparseVector`? Something like a new type `SparseArrayCSC{Tv,Ti,D}` where D is the number of dimensions (1 for `SparseVector`, 2 for `SparseMatrixCSC`). There seems to be a lot of repeated code between these two types (as a `SparseVector` is very close to an Mx1 `SparseMatrixCSC`). A bonus is that it would provide D>2 types for free.
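
To make the "very close to an Mx1 matrix" point concrete, here is a minimal sketch comparing the stored fields of the two types. It assumes the current field names in the `SparseArrays` standard library (`nzind`/`nzval` for the vector, `colptr`/`rowval`/`nzval` for the matrix); the variable names are only for illustration.

```julia
using SparseArrays

# A length-5 sparse vector with entries at positions 2 and 4.
v = sparsevec([2, 4], [1.5, 2.5], 5)

# The same data stored as a 5x1 sparse matrix (everything in column 1).
A = sparse([2, 4], [1, 1], [1.5, 2.5], 5, 1)

# The index and value arrays coincide.
same_indices = v.nzind == A.rowval
same_values  = v.nzval == A.nzval

# The matrix additionally stores a colptr with one entry per column, plus one.
extra_colptr = A.colptr
```

In this sketch the only structural difference is the two-element `colptr`, which is what suggests that a lot of the code could be shared.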

While it might be possible to share more code between SparseVector and SparseMatrixCSC, I don’t think it will give D>2 “for free” (since all the existing code is written specifically for either D=1 or D=2).

Maybe I’m too optimistic, but it seems reasonable to me to imagine that code that is general enough for both D=1 and D=2 would also work for D>2. My naive idea would be to think of a general D-dimensional array as a (D-1)-dimensional collection of sparse columns. So essentially it is only the iteration over colptr that needs to change. I think the easiest would be to have S.colptr::Vector{Ti} with length(S.colptr) == prod(S.dims[2:end])+1 and size(S) = S.dims. (Alternatively, S.colptr could be an Array of dimension D-1, but this has some drawbacks.) But I don’t know if any of this has already been discussed (and/or discarded!) and if there is potential interest in this.
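
A rough sketch of that layout, to show what I mean. Everything here is hypothetical: `SparseArrayCSCSketch` is not an existing or proposed type, and the lookup is a naive linear scan over one column rather than anything optimized.

```julia
# Hypothetical type, for illustration only; not part of SparseArrays.
struct SparseArrayCSCSketch{Tv,Ti<:Integer,D}
    dims::NTuple{D,Int}
    colptr::Vector{Ti}   # one entry per "column", plus one
    rowval::Vector{Ti}   # first-dimension index of each stored entry
    nzval::Vector{Tv}    # stored values
end

Base.size(S::SparseArrayCSCSketch) = S.dims

function Base.getindex(S::SparseArrayCSCSketch{Tv,Ti,D}, I::Vararg{Int,D}) where {Tv,Ti,D}
    # Column-major linear index of the column selected by the trailing D-1 indices.
    col, stride = 1, 1
    for k in 2:D
        col += (I[k] - 1) * stride
        stride *= S.dims[k]
    end
    # Scan the stored entries of that column for first-dimension index I[1].
    for p in S.colptr[col]:(S.colptr[col + 1] - 1)
        S.rowval[p] == I[1] && return S.nzval[p]
    end
    return zero(Tv)
end

# A 2x2x2 array with stored entries at (2,1,1), (1,2,1) and (2,2,2).
# Columns are ordered (j,k) = (1,1), (2,1), (1,2), (2,2); the third is empty.
S3 = SparseArrayCSCSketch((2, 2, 2), [1, 2, 3, 3, 4], [2, 1, 2], [1.0, 2.0, 3.0])

# D=1 uses the same code path: a single column, so colptr has two entries.
S1 = SparseArrayCSCSketch((5,), [1, 3], [2, 4], [1.5, 2.5])
```

With this layout, `S3[1, 2, 1]` finds the stored `2.0` and `S3[1, 1, 2]` falls through to `zero(Tv)`; bounds checking and everything beyond scalar lookup are left out.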

It’s something that I’ve considered in the past, too: it’s effectively embedding a reshape into the CSC structure itself. I’ve tried to find our previous discussions but came up empty; perhaps it was on Slack. It’d work for some basic structural functions, but I think even the general indexing implementation wouldn’t generalize effectively.
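
As a small sketch of the "embedded reshape" view: a 2x2x2 array can be stored today as a 2x4 `SparseMatrixCSC` plus the intended dims, with the trailing indices folded into a column number. The helper `lookup3` and the variable names are hypothetical, just for illustration.

```julia
using SparseArrays

# Intended shape of the N-d array; the matrix below is its 2x4 flattening.
dims = (2, 2, 2)

# Stored entries at (2,1,1), (1,2,1) and (2,2,2) land in columns 1, 2 and 4.
A = sparse([2, 1, 2], [1, 2, 4], [1.0, 2.0, 3.0], 2, 4)

# Hypothetical helper: fold the trailing indices (j, k) into a column of A.
lookup3(A, dims, i, j, k) = A[i, LinearIndices(Base.tail(dims))[j, k]]

x = lookup3(A, dims, 1, 2, 1)   # stored entry
y = lookup3(A, dims, 1, 1, 2)   # structural zero
```

Scalar lookup like this is the easy part; slicing along the trailing dimensions is where a generic implementation gets harder.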

Changing the structure of SparseMatrixCSC would be breaking, but adding general CSC-like-property-accessors would be a feature: https://github.com/JuliaLang/julia/issues/26613. It’d then be easy to add those definitions for the compatible reshape(::SparseMatrixCSC,...)s.

Thanks @mbauman for the explanation. I’m not sure I understand, though: are you proposing to add another type instead of unifying those two?