Best practice: threaded sparse matrix assembly

tduretz · June 4, 2026, 8:22pm

Thank you for the answer, I will read through the paper!

Here, I paste a naive MWE, which I guess probably shows how things should not be done. Improvements are welcome !

using Base.Threads, ExtendableSparse, LinearAlgebra

@views function serial_assembly!(M, n)
    for i=1:n
        if i==1
            M[i,i  ] =  1.0  
            M[i,i+1] = -1.0
        elseif i==n
            M[i,i-1] = -1.0  
            M[i,i  ] =  1.0
        else
            M[i,i-1] = -1.0  
            M[i,i  ] =  2.0
            M[i,i+1] = -1.0
        end
    end
end

@views function threaded_assembly!(M, M_loc, n)
    Threads.@threads for i=1:n
        tid = threadid()
        if i==1
            M_loc[tid-1][i,i  ] =  1.0  
            M_loc[tid-1][i,i+1] = -1.0
        elseif i==n
            M_loc[tid-1][i,i-1] = -1.0  
            M_loc[tid-1][i,i  ] =  1.0
        else
            M_loc[tid-1][i,i-1] = -1.0  
            M_loc[tid-1][i,i  ] =  2.0
            M_loc[tid-1][i,i+1] = -1.0
        end
    end
end

let 
    n  = 10_000_000
    # Serial
    M1 = ExtendableSparseMatrix(n,n)
    @time serial_assembly!(M1, n)
    # Threaded
    M2 = ExtendableSparseMatrix(n,n)
    M2_loc = [ExtendableSparseMatrix(n,n) for i=1:nthreads()]
    @time threaded_assembly!(M2, M2_loc, n)
    # Reduce. Looks like a conversion to sparse is needed to allow for .+
    # This makes things a slightly faster
    @time for k=1:nthreads()
        M2 .= sparse(M2) .+ sparse(M2_loc[k])
    end
    # Check
    @show norm(M1 .- M2)
end

In this 1D example, the reduction is super slow (threaded slower than serial). For 2D problems, there is a bit of benefit of doing this, but it’s still a very inefficient way. What could be the most straightforward way to thread such type of assembly?

Topic		Replies	Views
Parallel assembly of a finite element sparse matrix General Usage parallel	34	2168	February 28, 2024
Speeding up sparse matrix multiplication and assembly Numerics	8	3741	June 24, 2019
Parallel Assembly of Matrices in FEM Performance	8	572	January 16, 2025
Multithreading with sparse arrays Numerics	3	1847	May 6, 2021
From toy to production code with sparse matrix and sparse-direct factorizations Performance sparse , cholmod	17	1385	November 1, 2022

Best practice: threaded sparse matrix assembly

Related topics