Weird duplicated entries with different values in SparseMatrixCSC

Hi, I found a very weird thing when using SparseMatrixCSC:

K is a SparseMatrixCSC created from some complicated process (Finite element analysis application), here I show part of it:

 K = spzeros(ndof, ndof) #ndof : num of degree of freedom

# Assemble stiffness matrix 
 for i = 1:nel # traverse finite element 
    rowIdx = iK[1:8, (i-1)*8+1 : i*8]
    colIdx = jK[:, i]
    K_colptr, K_rowval, K_nzval_idx = build_sparseCSC_input(rowIdx[:], colIdx[:])
    K_nzval = (sK[:][K_nzval_idx])[:]
   # add element stiffness matrix to global 
    K += SparseMatrixCSC(ndof, ndof, K_colptr, K_rowval, K_nzval)
end

build_sparseCSC_input(rowIdx[:], colIdx[:]) is used to transfrom input of sparse to input of SparseMatrixCSC. The content of it I will show at the end.

The result of K is

8×8 SparseMatrixCSC{Float64,Int64} with 151 stored entries:
  [1, 1]  =  0.366667
  [2, 1]  =  0.175
  [3, 1]  =  0.0833333
  [4, 1]  =  0.075
  [5, 1]  =  -0.533333
  [1, 1]  =  0.366667
  ⋮
  [1, 8]  =  -0.175
  [2, 8]  =  -0.183333
  [5, 8]  =  -2.77556e-17
  [6, 8]  =  -0.533333
  [1, 8]  =  0.175
  [2, 8]  =  -0.183333

A weird thing happened: 8×8 sparse matrix has 151 stored entries!

When I show K

@show K 

It gives:

K = 
  [1, 1]  =  0.366667
  [2, 1]  =  0.175 <== weird 
  [3, 1]  =  0.0833333
  [4, 1]  =  0.075
  [5, 1]  =  -0.533333
  [1, 1]  =  0.366667
  [2, 1]  =  -0.175  <== weird 
  ...
  [2, 8]  =  -0.183333  
  [3, 8]  =  0.075  <== weird 
  [4, 8]  =  0.0833333
  [8, 8]  =  1.46667
  [3, 8]  =  -0.15  <== weird 
  [4, 8]  =  0.166667
  [1, 8]  =  -0.175
  [2, 8]  =  -0.183333
  [5, 8]  =  -2.77556e-17
  [6, 8]  =  -0.533333
  [1, 8]  =  0.175
  [2, 8]  =  -0.183333

You can see there are some duplicated entries like [2,1],[2,8],[3,8] with different values!

Is this a bug of Julia SparseMatrixCSC? My Julia version is v"1.5.3"


Appendix: content of build_sparseCSC_input, which is used to transfrom input of sparse to input of SparseMatrixCSC

function build_sparseCSC_input(rowIdx, colIdx)
    n = length(colIdx)
    colptr = [1]
    rowval = Array{Int}([])
    nzval_idx  = Array{Int}([])
    used_flags = zeros(n, 1)
    j = 1
    m = 0 # num of allocated value 

    while(m < n)
        for i = 1:n
            if(used_flags[i] == 0) # if not used 
                col = colIdx[i]
                if(col == j)  # if the value is in the current column
                    push!(nzval_idx, i)
                    push!(rowval, rowIdx[i])
                    m += 1
                    used_flags[i] = 1 # set this entry as used 
                end
            end
        end
        j += 1 # increase j to next column 
        push!(colptr, m+1) #update colptr 
    end
    
    return colptr, rowval, nzval_idx
end

Are you sure that your inputs are well-formed? Are you creating them using the exposed API, or SparseMatrixCSC? Generally, I would recommend using the API, the constructor is internal.

If you are sure that your inputs are valid, then please construct an MWE to demonstrate the problem, as it may indicate a bug.

Hi, thx for your guidance, I have checked the input to SparseMatrixCSC which seems to be correct, and I am using the exposed API of SparseMatrixCSC of

(::Type{SparseMatrixCSC})(m::Integer, n::Integer, colptr::Array{T,1} where T, rowval::Array{T,1} where T, nzval::Array{T,1} where T) in SparseArrays at /usr/local/julia/share/julia/stdlib/v1.5/SparseArrays/src/sparsematrix.jl:35

here is the MWE that can reproduce this bug:

  • test_build_sparseCSC_input is used to demonstrate the correctness of function build_sparseCSC_input,
  • function test1 is where the bug of duplicated entries with different values inside K happens.
using SparseArrays

function build_sparseCSC_input(rowIdx, colIdx)
    n = length(colIdx)
    colptr = [1]
    rowval = Array{Int}([])
    nzval_idx  = Array{Int}([])
    used_flags = zeros(n, 1)
    j = 1
    m = 0 # num of used value 

    while (m < n)
        for i = 1:n
            if (used_flags[i] == 0) # if not used 
                col = colIdx[i]
                if (col == j) # if the value is in the current No.j column 
                    push!(nzval_idx, i)
                    push!(rowval, rowIdx[i])
                    m += 1
                    used_flags[i] = 1 # set this entry as used 
                end
            end
        end
        j += 1 # increase j to next column 
        push!(colptr, m + 1) #update colptr 
    end

    return colptr, rowval, nzval_idx
end

function test_build_sparseCSC_input() # To demonstrate the correctness of build_sparseCSC_input()
    val = [1.0,2.0,3.0,4.0]
    rowIdx = [1, 1, 2, 2]
    colIdx = [1, 2, 1, 2]
    
    colptr, rowval, nzval_idx = build_sparseCSC_input(rowIdx[:], colIdx[:])
    nzval = val[nzval_idx]
    A = SparseMatrixCSC(2, 2, colptr, rowval, nzval)

    @assert Matrix(A) == Matrix(sparse(rowIdx, colIdx, val, 2, 2))
end

function test1()
    
    edofMat = [3 4 7 8 5 6 1 2; 
            1 2 5 6 7 8 3 4; 
            7 8 3 4 1 2 5 6; 
            5 6 1 2 3 4 7 8]
    nel = 4
    ndof = 8
    iK = kron(edofMat, ones(8, 1))';
    jK = kron(edofMat, ones(1, 8))';
    sK = rand(64, 4)

    K = spzeros(ndof, ndof)
    for i = 1:nel
        rowIdx = iK[1:8, (i - 1) * 8 + 1:i * 8][:]
        colIdx = jK[:, i][:]
        K_colptr, K_rowval, K_nzval_idx = build_sparseCSC_input(rowIdx, colIdx)
        K_nzval = sK[K_nzval_idx] 
        K_temp =  SparseMatrixCSC(ndof, ndof, K_colptr, K_rowval, K_nzval)
        
        K = K + K_temp
    end
    
    return K 
end

test_build_sparseCSC_input()
K = test1()

If you have any question about MWE, please tell me. Thx.

The SparseMatrixCSC constructor is pretty low-level and won’t do much input validation (like combining duplicate entries). You should probably use the high-level sparse constructor as suggested by the SparseMatrixCSC docs.

help?> SparseArrays.SparseMatrixCSC

  Matrix type for storing sparse matrices in the Compressed Sparse Column format. 
The standard way of constructing SparseMatrixCSC is through the sparse function. 
See also spzeros, spdiagm and sprand.
help?> sparse

  sparse(I, J, V,[ m, n, combine])


  Create a sparse matrix S of dimensions m x n such that S[I[k], J[k]] = V[k]. The combine function is used to
  combine duplicates. If m and n are not specified, they are set to maximum(I) and maximum(J) respectively. If the
  combine function is not supplied, combine defaults to + unless the elements of V are Booleans in which case combine
  defaults to |. All elements of I must satisfy 1 <= I[k] <= m, and all elements of J must satisfy 1 <= J[k] <= n.
  Numerical zeros in (I, J, V) are retained as structural nonzeros; to drop numerical zeros, use dropzeros!.

  For additional documentation and an expert driver, see SparseArrays.sparse!.
1 Like

I understand that it works for a 2x2 case, but I still think you have a bug in your code.

Adding

        K_temp2 = sparse(Matrix(K_temp))
        @assert K_temp == K_temp2

to the loop will make it fail.

Use the exposed API.

1 Like

Thanks for all your help! @stillyslalom @Tamas_Papp

After checking my codes, I find a bug with SparseMatrixCSC, which is shown below, and sparse doesn’t have this problem. I think this is an internal bug of SparseArrays.

julia> colptr = [1, 9]
2-element Array{Int64,1}: 
 1
 9

julia> rowval = [3, 4, 7, 8, 5, 6, 1, 2]
8-element Array{Int64,1}: 
 3
 4
 7
 8
 5
 6
 1
 2

julia> nzval = [0.16316121444220966, 0.859772711351463, 0.5396728938322661, 0.11042759016283576, 0.9235098869026452, 0.44937767272991924, 0.08224496368154854, 0.6310509691686923]
8-element Array{Float64,1}:
 0.16316121444220966
 0.859772711351463
 0.5396728938322661
 0.11042759016283576
 0.9235098869026452
 0.44937767272991924
 0.08224496368154854
 0.6310509691686923

julia> a = SparseMatrixCSC(8, 1, colptr, rowval, nzval)
8×1 SparseMatrixCSC{Float64,Int64} with 8 stored entries:
  [3, 1]  =  0.163161
  [4, 1]  =  0.859773
  [7, 1]  =  0.539673
  [8, 1]  =  0.110428
  [5, 1]  =  0.92351
  [6, 1]  =  0.449378
  [1, 1]  =  0.082245
  [2, 1]  =  0.631051


julia> a[1,1]
0.0

julia> a[2,1]
0.0 # Which is wrong!!! Should be  0.631051

julia> b  = sparse(Matrix(a))
8×1 SparseMatrixCSC{Float64,Int64} with 8 stored entries:
  [1, 1]  =  0.082245
  [2, 1]  =  0.631051
  [3, 1]  =  0.163161
  [4, 1]  =  0.859773
  [5, 1]  =  0.92351
  [6, 1]  =  0.449378
  [7, 1]  =  0.539673
  [8, 1]  =  0.110428

julia> b[2,1]
0.6310509691686923 # This time is right!

I will try to use sparse.

I still don’t get why you are calling this a bug. Again, SparseMatrixCSC is not meant to be used directly, so there are no guarantees. As its docstring says, use sparse.

https://github.com/JuliaLang/julia/pull/22529 is a PR that would detect this. It would have printed:

`S.rowval` must be sorted for all column ranges
3 Likes

Great, thanks a lot! @kristoffer.carlsson

I see, really appreciate your help! @Tamas_Papp I will follow the document and use sparse.