As in the title: every block is dense, but only a small percentage of the blocks are non-vanishing.
Here you are
(Just search for the keyword “block matri” here)
Thanks, but this one is actually the opposite of what I want, unfortunately. It can define block matrices, in which each block can be sparse. In my case, each block is dense.
Is something like this what you want?
Introduction · BlockSparseMatrices.jl, or CompressedSparseBlocks.jl?
Note that searching for “block matrix” on JuliaHub gave many results, unlike “block matri”, which gave only one; results also seem better without a hyphen.
What kind of storage format do you want? BSR (also called BCSR) seems to be it, or you could even consider the Blocked-Ellpack format.
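To make the BSR/BCSR layout concrete, here is a small sketch using SciPy’s `scipy.sparse.bsr_matrix` (chosen only because it’s easy to test; the Julia packages above store the same three arrays):

```python
# BSR/BCSR stores dense blocks in a CSR-like scheme: a 3-D array of
# dense blocks, plus block-column indices and block-row pointers.
import numpy as np
from scipy.sparse import bsr_matrix

# A 4x4 matrix of 2x2 dense blocks; only 2 of the 4 block slots are
# non-vanishing (the case described in this thread).
dense = np.array([
    [1., 2., 0., 0.],
    [3., 4., 0., 0.],
    [0., 0., 5., 6.],
    [0., 0., 7., 8.],
])
A = bsr_matrix(dense, blocksize=(2, 2))

print(A.data.shape)  # two stored 2x2 dense blocks
print(A.indices)     # block-column index of each stored block
print(A.indptr)      # block-row pointers, like CSR but per block row
```

Only the non-vanishing blocks are stored, and each stored block is a contiguous dense tile, which is exactly what block-level kernels exploit.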
Do you have some kind of structure? E.g., is there the same number of blocks per row? That would make [blocked] Ellpack a better fit:
The Sliced Ellpack format is standardized and well-known as the state of the art. ..
[different section below]
The Blocked Ellpack format is similar to the standard Ellpack, where the column indices represent two-dimensional blocks instead of a single matrix entry.
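A hand-rolled sketch of that idea (names and the padding sentinel are mine, not any library’s API): every block row stores the same fixed number of 2x2 blocks, and rows with fewer blocks are padded.

```python
# Blocked-Ellpack sketch: block-column indices of shape
# (n_block_rows, ell_cols) and dense block values of shape
# (n_block_rows, ell_cols, bs, bs), padded to a fixed width.
import numpy as np

bs = 2        # block size
ell_cols = 2  # blocks stored per block row (after padding)
PAD = -1      # sentinel block-column index marking a padded slot

col_idx = np.array([[0, 1],
                    [1, PAD]])
values = np.zeros((2, ell_cols, bs, bs))
values[0, 0] = [[1, 2], [3, 4]]
values[0, 1] = [[5, 6], [7, 8]]
values[1, 0] = [[9, 10], [11, 12]]

def bell_matvec(col_idx, values, x):
    """y = A @ x for the blocked-Ellpack sketch above."""
    nbr = col_idx.shape[0]
    y = np.zeros(nbr * bs)
    for i in range(nbr):
        for s in range(ell_cols):
            j = col_idx[i, s]
            if j == PAD:
                continue  # padded slot, no block stored here
            y[i*bs:(i+1)*bs] += values[i, s] @ x[j*bs:(j+1)*bs]
    return y

x = np.arange(1., 5.)
print(bell_matvec(col_idx, values, x))
```

The fixed row width is what makes the format regular enough for GPU kernels; the price is the padding when block counts per row vary a lot.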
Do you want to run on a GPU? If so, consider cuSPARSE, but I also see cuSPARSELt, which I wasn’t familiar with. You have access to the former, and seemingly also some access to the latter, with CUDA.jl:
https://www.osti.gov/servlets/purl/1366653
We examine the implementation of block compressed row storage (BCSR) sparse matrix-vector multiplication (SpMV) for sparse matrices with dense block sub-structure, optimized for blocks with sizes from 2x2 to 32x32, on CPU, Intel many-integrated-core, and GPU architectures. .. We give a set of algorithms that performs SpMV up to 4x faster than the NVIDIA cuSPARSE cusparseDbsrmv routine, up to 147x faster than the Intel Math Kernel Library (MKL) mkl_dbsrmv routine (a single-threaded BCSR SpMV kernel), and up to 3x faster than the MKL mkl_dcsrmv routine (a multi-threaded CSR SpMV kernel).
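For reference, the BCSR SpMV the paper optimizes boils down to this loop (a plain NumPy sketch with my own variable names; real kernels tile and vectorize it):

```python
# Minimal BCSR SpMV: loop over block rows, then over each row's
# stored dense blocks, doing one small dense matvec per block.
import numpy as np

bs = 2  # dense block size (the paper covers 2x2 up to 32x32)
blocks = np.array([[[1., 2.], [3., 4.]],
                   [[5., 6.], [7., 8.]]])  # stored dense blocks
indices = np.array([0, 1])                 # block-column of each block
indptr = np.array([0, 1, 2])               # CSR-style block-row pointers

def bcsr_spmv(indptr, indices, blocks, x):
    """y = A @ x for a BCSR-stored matrix A."""
    nbr = len(indptr) - 1
    y = np.zeros(nbr * bs)
    for i in range(nbr):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            y[i*bs:(i+1)*bs] += blocks[k] @ x[j*bs:(j+1)*bs]
    return y

x = np.ones(4)
print(bcsr_spmv(indptr, indices, blocks, x))
```

The inner block-matvec is dense, so the speedups in the paper come from register/cache blocking and vectorizing that small product, not from any index gymnastics.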
What’s your application? If ML/LLMs, then see also:
One of the most challenging research questions around sparse MoE is therefore how to run it efficiently on machines that were never intended to run sparse computations in the first place.
This week, we’ll meet MegaBlocks (Gale et al 2022), the perhaps most efficient and scalable sparse MoE implementation in the industry today. The key idea is to re-formulate sparse MoE as a single block-sparse matrix multiplication instead of multiple dense matrix multiplications by leveraging a data format specifically designed to store and manipulate sparse matrices.
§5.1.3 describes our hybrid blocked-CSR-COO sparse matrix format, which enables efficient matrix products with sparse input and output operands.
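A toy version of the core reformulation (purely illustrative, with COO-style block coordinates rather than the paper’s hybrid blocked-CSR-COO format): the block-sparse operand is a list of (block_row, block_col, dense_block) triples, multiplied against a dense matrix.

```python
# Block-sparse x dense product: each stored dense block contributes
# one small dense matmul to the corresponding rows of the output.
import numpy as np

bs = 2  # block size
# Two stored blocks of a 4x4 block-sparse operand A.
coo = [(0, 0, np.array([[1., 0.], [0., 1.]])),   # I at block (0, 0)
       (1, 1, np.array([[2., 0.], [0., 2.]]))]   # 2*I at block (1, 1)

def block_coo_matmul(coo, B, n_block_rows):
    """C = A @ B where A is given as COO (block_row, block_col, block)."""
    C = np.zeros((n_block_rows * bs, B.shape[1]))
    for bi, bj, blk in coo:
        C[bi*bs:(bi+1)*bs] += blk @ B[bj*bs:(bj+1)*bs]
    return C

B = np.arange(8.).reshape(4, 2)
print(block_coo_matmul(coo, B, 2))
```

This is the sense in which sparse MoE becomes a single block-sparse matmul: each expert’s work is one (or a few) of these dense block products, batched under one kernel instead of many separate dense matmuls.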
Is it matrix-matrix multiply you need (and if so, are both operands in the same format, or is one dense)? Or matrix-vector multiply?