Why are duplicate entries in sparse matrices allowed?

mirestrepo · February 17, 2017, 10:06pm

Why does the representation of sparse matrices allow for duplicates?

a = spzeros(2, 3)
a[[1,1],2]=1
a[[1],2]=2

Leads to

2×3 sparse matrix with 2 Float64 nonzero entries:
	[1, 2]  =  2.0
	[1, 2]  =  1.0

This behavior seems prone to computation errors. Why is this allowed? What is the use case?

mauro3 · February 17, 2017, 10:11pm

a[[1],2]=2 just writes a 2 into a slot where a 1 was before. This is normal for any array, no?

mirestrepo · February 17, 2017, 10:14pm

Yes, but the internal representation of the sparse matrix now keeps two values for the same entry:

2×3 sparse matrix with 2 Float64 nonzero entries:
	[1, 2]  =  2.0
	[1, 2]  =  1.0

kristoffer.carlsson · February 17, 2017, 10:15pm

julia> a = spzeros(2, 3)
2×3 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> a[[1,1,1,1,1,1],2]=1
1

julia> a
2×3 SparseMatrixCSC{Float64,Int64} with 6 stored entries:
  [1, 2]  =  1.0
  [1, 2]  =  1.0
  [1, 2]  =  1.0
  [1, 2]  =  1.0
  [1, 2]  =  1.0
  [1, 2]  =  1.0

Looks like a bug to me. It should remove duplicates.

mauro3 · February 17, 2017, 10:21pm

Ah, yes, you’re right! Please file an issue (if there is none yet).

mirestrepo · February 17, 2017, 10:23pm

I googled a little and it seems Python and MATLAB do the same… I’m not sure whether removing duplicates would cause performance issues. But allowing duplicates can cause many unwanted bugs in one’s code. Since this is not allowed in full matrices, I see no reason why it would make sense in sparse ones.

Sparse matrices are part of base… Is the main JuliaLang repo the place to file an issue?

I just wanted to make sure it’s in fact an issue and not a known behavior.

kristoffer.carlsson · February 17, 2017, 10:26pm

It is definitely a bug since other methods will compute incorrectly on this malformed matrix:

julia> a
2×3 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
  [1, 2]  =  2.0
  [2, 2]  =  2.0
  [2, 2]  =  2.0
  [2, 2]  =  2.0

julia> countnz(a)
4

Yes, the julia main repo is the correct place to report the issue.
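
For anyone who wants to check a matrix for this kind of malformed state, here is a minimal sketch. The helper name has_duplicate_entries is hypothetical (it is not part of SparseArrays), and the code assumes a Julia 1.x release where SparseArrays is a standard library. It scans the CSC index vectors and reports whether any column lists the same row index twice. The index vectors used below are an assumed reconstruction of the 2×3 matrix from the first post.

```julia
using SparseArrays

# Hypothetical helper, not part of SparseArrays: scan raw CSC index vectors
# and report whether any column lists the same row index more than once.
function has_duplicate_entries(colptr::AbstractVector{<:Integer},
                               rowval::AbstractVector{<:Integer})
    for j in 1:length(colptr) - 1
        seen = Set{eltype(rowval)}()
        # Stored entries of column j live at positions colptr[j]:(colptr[j+1]-1).
        for k in colptr[j]:(colptr[j + 1] - 1)
            rowval[k] in seen && return true
            push!(seen, rowval[k])
        end
    end
    return false
end

# Convenience method that reads the index vectors from the matrix fields.
has_duplicate_entries(A::SparseMatrixCSC) = has_duplicate_entries(A.colptr, A.rowval)

# Assumed index vectors for the malformed 2×3 matrix from the first post:
# column 2 stores row 1 twice.
bad_colptr = [1, 1, 3, 3]
bad_rowval = [1, 1]
dup_found = has_duplicate_entries(bad_colptr, bad_rowval)

# A matrix built with `sparse` has its duplicates combined on construction.
clean = sparse([1, 1], [2, 2], [1.0, 2.0], 2, 3)
clean_ok = !has_duplicate_entries(clean)
```

The check works on the raw vectors so that it can be run without first constructing an invalid SparseMatrixCSC.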

dorn-gerhard · February 1, 2023, 4:26pm

Maybe this relates to how duplicates in sparse matrices are treated now.

By default they are summed:

julia> A = sparse([2,2,2,2], [1,1,1,1], [2,1,3,4])
2×1 SparseMatrixCSC{Int64, Int64} with 1 stored entry:
  ⋅
 10

Or they are aggregated using the function passed as the combine argument:

julia> A = sparse([2,2,2,2], [1,1,1,1], [2,1,3,4], 2, 1, max)
2×1 SparseMatrixCSC{Int64, Int64} with 1 stored entry:
 ⋅
 4

If only the first (or last) value should be kept, use (a,b) -> a or (a,b) -> b:

julia> A = sparse([2,2,2,2], [1,1,1,1], [2,1,3,4], 2, 1, (a,b) -> a)
2×1 SparseMatrixCSC{Int64, Int64} with 1 stored entry:
 ⋅
 2
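
A quick way to confirm what ends up stored is to inspect the result with findnz and nnz. This is only a sketch, assuming a current Julia 1.x release with the SparseArrays standard library. It also shows that repeated scalar assignment to the same position overwrites the value rather than adding a second stored entry.

```julia
using SparseArrays

# Duplicate coordinates passed to `sparse` are combined (summed by default).
A = sparse([2, 2, 2, 2], [1, 1, 1, 1], [2, 1, 3, 4])
rows, cols, vals = findnz(A)   # coordinate triplets of the stored entries

# Assigning twice to the same position keeps a single stored entry.
a = spzeros(2, 3)
a[1, 2] = 1
a[1, 2] = 2
stored = nnz(a)                # number of stored entries
```

Here rows, cols and vals each hold one element (2, 1 and 10), and stored is 1 with a[1, 2] equal to 2.0.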
