Apple Accelerate Sparse Solvers

I am trying to wrap the calls for Apple’s Accelerate Sparse Solver and it is my first try at wrapping apple routines.

I am following some of the code structure in Accelerate.jl and LinearSolvers.jl but I keep failing at wrapping the SparseMatrixStructure object from Accelerate in Julia.

I am following this docs Creating sparse matrices | Apple Developer Documentation

I am trying this so far

using LinearAlgebra
using Libdl
using SparseArrays

# For now, only use BLAS from Accelerate (that is to say, vecLib)
const global libacc = "/System/Library/Frameworks/Accelerate.framework/Accelerate"
libacc_hdl = Libdl.dlopen_e(libacc)

mat = sprand(10, 10, 0.4)

rowIndices = mat.nzind .- 1
values = mat.nzval
column_starts = mat.colptr .- 1

structure_acc = ccall((:SparseMatrixStructure, libacc), Ptr{Cvoid},
    (Cint, Cint, Ptr{Cint}, Ptr{Cint}, Ptr{Cvoid}, Cint),
    10, 10, column_starts, rowIndices, Ptr{Cvoid}(), 1)

I am following the header located at

/Library/Developer/CommandLineTools/SDKs/MacOSX14.2.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers/Sparse/Solve.h

But ccall isn’t able to find the symbol. I am not sure if I am calling the correct method or loading the correct library.

Any pointers on how to proceed will be appreciated.

2 Likes

This requires a little knowledge of C to understand. There is no function called SparseMatrixStructure.

For Julia, you need to redeclare the struct.

Clang.jl can automate the process of recreating the struct for you.

Thanks for your answer. What I don’t get is why the clang call will produce this method for example

function SparseMultiply(A, X, Y)
    ccall((:SparseMultiply, libSparse), Cvoid, (SparseMatrix_Double, DenseMatrix_Double, DenseMatrix_Double), A, X, Y)
end

and I look for the symbol in the library it can’t find it

libacc ="/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparse.dylib"

libacc_hdl = Libdl.dlopen_e(libacc)

dlsym(libacc_hdl, :SparseMultiply)
ERROR: could not load symbol "SparseMultiply":
dlsym(0x7ffe8dc6a948, SparseMultiply): symbol not found
Stacktrace:
 [1] dlsym(hnd::Ptr{Nothing}, s::Symbol; throw_error::Bool)
   @ Base.Libc.Libdl ./libdl.jl:59
 [2] dlsym(hnd::Ptr{Nothing}, s::Symbol)
   @ Base.Libc.Libdl ./libdl.jl:56
 [3] top-level scope
   @ REPL[26]:1

This was the code that I ran with clang

options = load_options(joinpath(@__DIR__, "wrap.toml"))

    args = get_default_args()
    includedir = " /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparse.dylib"
    push!(args, "-I$includedir")

    headers = ["/Library/Developer/CommandLineTools/SDKs/MacOSX14.2.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers/Sparse/Solve.h"]

    ctx = create_context(headers, args, options)

    build!(ctx, BUILDSTAGE_NO_PRINTING)

    build!(ctx, BUILDSTAGE_PRINTING_ONLY)

the warp file is

[general]
library_name = "libSparse"
output_file_path = "./libSparse.jl"

[codegen]
always_NUL_terminated_string = true

This was an interesting rabbit hole to go down.

It seems that all the dylibs are now aggregated into some kind of cache as of Big Sur:

  • New in macOS Big Sur 11.0.1, the system ships with a built-in dynamic linker cache of all system-provided libraries. As part of this change, copies of dynamic libraries are no longer present on the filesystem. Code that attempts to check for dynamic library presence by looking for a file at a path or enumerating a directory will fail. Instead, check for library presence by attempting to dlopen() the path, which will correctly check for the library in the cache. (62986286)

To actually inspect the dylibs, I needed to use this:

Now that I can actually inspect the dylibs I can see the symbols are actually mangled.

% cd /tmp/libraries/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A
% nm -gU libSparse.dylib | grep SparseMultiply                                                                      
0000000181087838 T __SparseMultiplySubfactor_Double
00000001810ac724 T __SparseMultiplySubfactor_Float
000000018109a880 T __Z14SparseMultiply18SparseMatrix_Float17DenseMatrix_FloatS0_
000000018109a8e0 T __Z14SparseMultiply18SparseMatrix_Float17DenseVector_FloatS0_
0000000181091694 T __Z14SparseMultiply19SparseMatrix_Double18DenseMatrix_DoubleS0_
00000001810916f4 T __Z14SparseMultiply19SparseMatrix_Double18DenseVector_DoubleS0_
000000018109ed6c T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseMatrix_Float
000000018109f5c4 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseMatrix_FloatPv
000000018109f008 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseMatrix_FloatS0_
000000018109f7e8 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseMatrix_FloatS0_Pv
000000018109fd34 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseVector_Float
000000018109fdf4 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseVector_FloatPv
000000018109fd8c T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseVector_FloatS0_
000000018109fe50 T __Z14SparseMultiply27SparseOpaqueSubfactor_Float17DenseVector_FloatS0_Pv
0000000181095b80 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseMatrix_Double
00000001810963d8 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseMatrix_DoublePv
0000000181095e1c T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseMatrix_DoubleS0_
00000001810965fc T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseMatrix_DoubleS0_Pv
0000000181096b48 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseVector_Double
0000000181096c08 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseVector_DoublePv
0000000181096ba0 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseVector_DoubleS0_
0000000181096c64 T __Z14SparseMultiply28SparseOpaqueSubfactor_Double18DenseVector_DoubleS0_Pv
0000000181091260 T __Z14SparseMultiplyd19SparseMatrix_Double18DenseMatrix_DoubleS0_
0000000181091594 T __Z14SparseMultiplyd19SparseMatrix_Double18DenseVector_DoubleS0_
000000018109a44c T __Z14SparseMultiplyf18SparseMatrix_Float17DenseMatrix_FloatS0_
000000018109a780 T __Z14SparseMultiplyf18SparseMatrix_Float17DenseVector_FloatS0_
000000018109a918 T __Z17SparseMultiplyAdd18SparseMatrix_Float17DenseMatrix_FloatS0_
000000018109acac T __Z17SparseMultiplyAdd18SparseMatrix_Float17DenseVector_FloatS0_
000000018109172c T __Z17SparseMultiplyAdd19SparseMatrix_Double18DenseMatrix_DoubleS0_
0000000181091ac0 T __Z17SparseMultiplyAdd19SparseMatrix_Double18DenseVector_DoubleS0_
000000018109178c T __Z17SparseMultiplyAddd19SparseMatrix_Double18DenseMatrix_DoubleS0_
0000000181091af8 T __Z17SparseMultiplyAddd19SparseMatrix_Double18DenseVector_DoubleS0_
000000018109a978 T __Z17SparseMultiplyAddf18SparseMatrix_Float17DenseMatrix_FloatS0_
000000018109ace4 T __Z17SparseMultiplyAddf18SparseMatrix_Float17DenseVector_FloatS0_

Those can be exposed by dlsym.

julia> libSparse = dlopen("/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/libSparse.dylib", RTLD_NOW | RTLD_GLOBAL)
Ptr{Nothing} @0x00000003b81a5ca0

julia> dlsym(libSparse, :_Z14SparseMultiplyf18SparseMatrix_Float17DenseMatrix_FloatS0_)
Ptr{Nothing} @0x000000018c60e44c

This really make this work, I suspect we will need GitHub - JuliaInterop/CxxWrap.jl: Package to make C++ libraries available in Julia

1 Like

Here’s a quick and dirty demo.

First, let’s start with the setup.

struct DenseMatrix_Double
    rowCount::Cint
    columnCount::Cint
    columnStride::Cint
    attributes::UInt16
    data::Ptr{Cdouble}
end

dense_data = zeros(Float64, 3,3)
dense = DenseMatrix_Double(3, 3, 3, 0, pointer(dense_data))

dense2_data = ones(Float64, 3,3)
dense2 = DenseMatrix_Double(3, 3, 3, 0, pointer(dense2_data))

struct SparseMatrixStructure
    rowCount::Cint
    columnCount::Cint
    columnStarts::Ptr{Clong}
    rowIndices::Ptr{Cint}
    attributes::UInt16
    blockSize::UInt8
end

columnStarts = Clong[0, 2, 4, 5]
rowIndices = Cint[0, 2, 0, 1, 2]

matrix_structure = SparseMatrixStructure(
    3, 3, pointer(columnStarts), pointer(rowIndices), 0, 3
)

struct SparseMatrix_Double
    structure::SparseMatrixStructure
    data::Ptr{Cdouble}
end

data = Cdouble[1.0, 0.1, 9.2, 0.3, 0.5, 1.3, 0.2, 1.3, 4.5]
sparse_matrix_double = SparseMatrix_Double(matrix_structure, pointer(data))

using Libdl
libSparse = dlopen("/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/libSparse.dylib")
SparseMultiply = dlsym(libSparse, :_Z14SparseMultiply19SparseMatrix_Double18DenseMatrix_DoubleS0_)

Now let’s play with it in the REPL:

julia> include("accelerate_test.jl");

julia> dense_data
3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

julia> dense2_data
3×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

julia> ccall(SparseMultiply, Nothing, (SparseMatrix_Double, DenseMatrix_Double, DenseMatrix_Double), sparse_matrix_double, dense, dense2)

julia> dense2_data
3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

julia> using LinearAlgebra

julia> dense_data .= I(3)
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

julia> ccall(SparseMultiply, Nothing, (SparseMatrix_Double, DenseMatrix_Double, DenseMatrix_Double), sparse_matrix_double, dense, dense2)

julia> dense2_data
3×3 Matrix{Float64}:
 1.0  9.2  0.0
 0.0  0.3  0.0
 0.1  0.0  0.5


5 Likes

Thanks a lot for the prototype. I wonder how much work would it be to expose the linear solver.

Any suggestions on how to handle the mangling? Or should I just try to work it out manually?

For mangling or demangling there are existing tools such as the following

https://llvm.org/docs/CommandGuide/llvm-cxxfilt.html

The LLVM version seems advisable given that we are dealing with an Apple framework. It may be packaged as part of one of our LLVM JLLs.

It seems to be packaged as part of LLVM_Full_jll under tools/llvm-cxxfilt