Transfer a Sparse Matrix to a C Function

I have a C function (See IncompleteCholeskyDecomposition.c) which accepts a Sparse Matrix in CSC format (Actually 2 matrices).

I wonder, how do I extract the data, row index and column pointer from a given Julia Sparse Matrix?
I want this to be efficient so I also wonder if I can allocate a matrix with no values, get the pointers and edit the values and indices.

Is there an example for that?

This is actually pretty easy because the data, row indices, and column pointers are already stored as plain arrays inside a Julia SparseMatrixCSC. You just need to convert them to the appropriate integer and floating-point types (if your C code uses something different) and subtract 1 from all the indices. The OSQP and SCS optimization wrappers do this internally:

2 Likes

Note that the manual is quite explicit and clear on this one (no need to dig into the implementation):

https://docs.julialang.org/en/v1/stdlib/SparseArrays/#man-csc

1 Like

Won’t those conversions make a copy?
What I want is to access the values in C as they are and update them.
I will handle the 1 vs. 0 in the C code.

It is assumed the Julia matrix is either Float32 or Float64.

struct SparseMatrixCSC{Tv,Ti<:Integer} <: AbstractSparseMatrixCSC{Tv,Ti}
    m::Int                  # Number of rows
    n::Int                  # Number of columns
    colptr::Vector{Ti}      # Column j is in colptr[j]:(colptr[j+1]-1)
    rowval::Vector{Ti}      # Row indices of stored values
    nzval::Vector{Tv}       # Stored values, typically nonzeros
end

m and n are the dimensions. Ti is an integer type to handle indexes, and Tv may be of any type, according to the type of data. The pointers of the vectors of this structure can be passed to a C routine without copying.

I guess my question is exactly how to create this call with the pointers to those vectors.

They are just Julia arrays, and then you can use:

https://docs.julialang.org/en/v1/manual/calling-c-and-fortran-code/

with a good understanding of C you probably will undertand that section relatively easy (which is not my case, by the way).

Yes, they will make a copy, since the matrix is passed only once from julia → SCS and SCS wants 0-based indices;

If you want to handle it on your own just make sure the struct layout is compatible with what C library expects and pass it through Ref as @lmiq suggested

There are 2 optimizations I’m after:

  1. How to send the C function a pointer to colptr, rowval and nzval.
  2. How to make my whole Julia setup work with 32 Bit indices for the sparse matrix (See Convert a Current Installation of Julia to Use `BlasInt = `Int32`).

In (2) the problem I have is that most non trivial algorithms on Sparse Matrix (Decomposition) will gain from having indices in 32 Bit yet I don’t know about the other eco system (MKL Sparse, Suite Sparse, etc…). If I set my matrix to be 32 Bit, does it mean any call to a function of those libraries will create a conversion step?

I hope we have / we’ll have a way to configure all to be coherent. At the moment I don’t see a way to set MKL, MKL Sparse, Pardiso and other Intel based libraries to use LP64 mode (Some has but it will go away with Julia 1.7). I have no idea about SuiteSparse. It seems some of the algorithms support 32 Bit indices but it is not system wide.

M = SparseMatrixCSC{Float64, Int32}(rand(4,3));
ccall(
    (:function_name, :libname),
    ReturnType, 
    (Clong, Clong, Ref{Cint}, Ref{Cint}, Ref{Cdouble}),
    M.m, # nrows
    M.n, # ncols
    M.colptr,
    M.rowval,
    M.nzval,
)

Ref here just indicates that memory is managed by julia;

We compile SCS_GPU with Int32s since that’s what’s supported/required by cudaBLAS. What doesn’t work (in julia) with Int32 indices?

1 Like

Let’s say I build a sparse matrix with Int32 indices. For example:

numRows = 10_000;
numNnz 	= 1_000;
mA 		= sparse(rand(Int32.(1:numRows), numNnz), rand(Int32.(1:numRows), numNnz), rand(numNnz), numRows, numRows);

Now, assume I use MKLSparse or any other C / Fortran library which has option to have indices wither as Int32 or Int64. Currently, Julia use the interface which is defined by BlasInt. On 64 bit systems it is set to Int64.

So when I call: mA * vA where vA is a dense vector behind the scene Julia converts my sparse matrix to Int64. Not only the conversion takes time, the whole indexing arithmetic is now done with Int64 which means performance is degraded.

Think of algorithms like Factorization / Decomposition which are huge on indices arithmetic. They get slower because of that.

I wish that with the all Trampoline overhaul in Julia 1.7.x someone would also take care of this. There is no reason not to have access to both kind of implementations. For instance, on MKL it means to have both LP64 and ILP64 exposed and used according to the user defined matrix.

Did you see the JuliaCon talk? They cover exactly this: Runtime-switchable BLAS/LAPACK backends via libblastrampoline | E Saba, M Giordano | JuliaCon2021 - YouTube (skip to minute 6 for a demo with 32-bit OpenBLAS and MKL).

1 Like

I even remember watching the talk during juliacon, but somehow forgot that this part was featured there!

1 Like

Yes. I have watched it. Though what I’m talking is mainly for the Sparse Matrices.
Namely, what I’m asking is if both paths exist at once and if Julia chose the LP path for Sparse Matrices built with Int32.

The code on MKLSparse currently use BlasInt to chose the code path. It doesn’t even look at the matrix itself.
This is what I’m talking about (Though I’d say the 3-4 last comments should appear on Convert a Current Installation of Julia to Use `BlasInt = `Int32`).

The solution has 2 stages:

  1. Allow, at the same time, accessing BLAS / LAPCAK / Sparse BLAS / LAPACK libraries with Int32 and Int64 API.
  2. For Sparse Matrices chose the path based on the type of the indices. Go for LP64 for Sparse Matrices with Int32 / UInt32 indices. Use the ILP64 path for matrices built with Int64 / UInt64. One could even say that the path can have AUTO mode for matrices with less than 2 ^ 31 - 1 elements (Be Sparse or Dense).

It will bring performance and memory optimization for the eco system.

1 Like