Output sparse matrix to csv

I have a little program that correctly produces a sparse matrix from data. When I call the result:

display(dataOut)

I get the expected:

3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
-1.0 -1.0 ⋅
-1.23 ⋅ ⋅
⋅ ⋅ ⋅

Now I want to export that to a .csv file using

writedlm("Output.csv", dataOut, ",")

…but it fills the blank data with zeros.

Is there any way to avoid this behavior?

[Edit: Just to clarify, I am looking for a csv file output that looks exactly like the SparseMatrix, i.e. with blanks in the empty cells and observing the rows and columns]

See solution in this post.

Thanks for the suggestion.

Using dataframes like this doesn’t produce the correct behavior I’m afraid:

df = DataFrame([dataOut])

CSV.write("Output.csv", df)

I’m not sure if I am doing something wrong here, but I specifically want the data to be “clean”, as it were. So I don’t want a dataframe with any headers or anything else but the non-zero values one value per cell.

Still doing it, I’m afraid. I tried a couple of variations.

When I try your code directly it throws the following:

ERROR: LoadError: ArgumentError: column name :V not found in the data frame; existing most similar names are: :m and :n

Ah, that runs now, not sure what I did wrong.

However, now it is not observing the rows and columns, just outputting everything in one row in the .csv

[Will edit previous posts for formatting, thanks for the heads up]

Ok, could you try this:

using SparseArrays, DelimitedFiles
M = sprand(5, 5, 0.5)
Md = Matrix(M)          # creates dense Matrix{Float64}
Mb = similar(Md,Any)    # similar Matrix{Any}, type Any allows mixing floats with strings
Mb .= Md                # assigns content of Md to Mb
Mb[Md .== 0] .= ""      # bitmatrix Md.==0 provides indices of all 0's
open("spmatrix_blanks.csv", "w") do io
    writedlm(io, Mb, ',')
end
1 Like

Perfect. Thanks!

Now I’m going to have to spend the rest of the day trying to figure out why that works :grinning:

Added some comments to the code, in case it helps.

1 Like

Thanks!

So basically, it’s a dense matrix with blanks instead of zeros?

If so, that might actually be more useful to me in the long run.

Yes, it is a weird dense matrix of Any which was made up just to satisfy your CSV output request. Cannot see any other use for it, honestly speaking.

At the moment I’m just using Excel’s conditional formatting on it, otherwise I have to run a macro to get rid of the zeros on the other end. Eventually this step will probably be redundant if I can get the whole thing to run in its own app, but for now this is great.

Thanks again!

Tables.table(mat) will wrap a matrix in a tables-compatible object for use with CSV.write.

As for zeros versus missing for the sparse entries, I think there is a package for that, but googling hasn’t gotten me there. But if you can make a sparse matrix where the zero-values are missing, you can pass the option missingstring = "." to CSV.write and you will be good.

If you don’t need an ASCII file, perhaps the DataDrop package would work for you?

julia> using DataDrop
   
julia> using SparseArrays  
   
julia> c = sprand(4, 3, 0.5)     
4×3 SparseMatrixCSC{Float64, Int64} with 5 stored entries:     
  ⋅   ⋅   0.780225   
  ⋅  0.0612236   ⋅   
  ⋅   ⋅    ⋅   
 0.262007  0.778962   0.0651033  
   
julia> DataDrop.store_matrix("c.h5", c)   
   
julia> yac = DataDrop.retrieve_matrix("c.h5")   
4×3 SparseMatrixCSC{Float64, Int64} with 5 stored entries:     
  ⋅   ⋅   0.780225   
  ⋅  0.0612236   ⋅   
  ⋅   ⋅    ⋅   
 0.262007  0.778962   0.0651033  

OP is re-opening their file in Excel, so I think this is a bit heavy.

I see. This might be useful later on, but the output file is the same size as the CSV. Is this correct? Is there an advantage to using this format?

I must confess I can’t get that to work.

I’m assuming “mat” is the sparse matrix? So for my example:

Tables.table(dataOut)

?

Yes. Does that work?

You may need to add Tables.jl

Afraid not. It’s throwing:

ERROR: LoadError: ArgumentError: a ‘SparseMatrixCSC{Float64, Int64}’ is not a table;

Works for me

julia> using Tables, SparseArrays, CSV;

julia> A = sparse([1, 1, 2, 3], [1, 3, 2, 3], [0, 1, 2, 0]);

julia> A_table = Tables.table(A);

julia> CSV.write("my_file.csv", A_table);

Ah yes, that would be me getting the syntax wrong again.

Unfortunately the output is giving me headers and zeros still, even after adding missingstring:

CSV.write("my_file.csv", A_table, missingstring = ".");

and with the variation:

CSV.write("my_file.csv", A_table, missingstring = "");