Here are some data points. Let us call them unintuitive.
pkrysl@samadira sparse_transpose_tests.jl % julia --project t.jl
[ Info: Matrix M = 20133 by N = 20133, sparsity = 0.0004966969651815428, nnz = 201362
[ Info: Benchmarking copy+transpose CSC
695.834 μs (12 allocations: 3.23 MiB)
[ Info: Benchmarking copy+transpose CSR
1.893 s (3 allocations: 3.02 GiB)
[ Info: Benchmarking gbtranspose CSC
823.708 μs (23 allocations: 3.69 MiB)
[ Info: Benchmarking gbtranspose CSR
1.071 s (18 allocations: 3.02 GiB)
[ Info: Benchmarking csr_transpose
663.541 μs (15 allocations: 3.53 MiB)
[ Info: Benchmarking csr_transpose_2
631.500 μs (9 allocations: 3.23 MiB)
pkrysl@samadira sparse_transpose_tests.jl %
If you feel so inclined, test on your particular architecture.
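For anyone who wants to try this without the test scripts, here is a minimal sketch of how the CSC baseline could be timed. It is not the content of `t.jl`, which is not shown here. The helper name `time_csc_transpose`, the use of `sprand`, and passing the reported sparsity value as the density are all assumptions. It uses only the `SparseArrays` standard library and Base `@elapsed`, so the numbers will be rougher than the ones in the log above.

```julia
using SparseArrays  # standard library

# Hypothetical helper (not taken from t.jl): time copy(transpose(A)) for a
# random CSC matrix and return the matrix together with the best time seen.
function time_csc_transpose(M, N, density; repeats = 5)
    A = sprand(M, N, density)       # random SparseMatrixCSC{Float64, Int}
    copy(transpose(A))              # warm-up run so compilation is not timed
    best = minimum(@elapsed(copy(transpose(A))) for _ in 1:repeats)
    return A, best
end

# Dimensions and sparsity copied from the first log line above; using the
# sparsity as the sprand density is an assumption.
A, t = time_csc_transpose(20133, 20133, 0.0004966969651815428)
@info "Matrix M = $(size(A, 1)) by N = $(size(A, 2)), nnz = $(nnz(A))"
@info "copy+transpose CSC: best of 5 runs = $(round(t * 1e6; digits = 1)) μs"
```

Taking the minimum over a few repeats is a crude stand-in for a proper benchmarking harness; it only filters out the worst of the timing noise.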
Some more, this time for rectangular matrices.
[ Info: Matrix M = 20133 by N = 60399, sparsity = 0.0004966969651815428, nnz = 604295
[ Info: Benchmarking copy+transpose CSC
1.987 ms (12 allocations: 9.37 MiB)
[ Info: Benchmarking copy+transpose CSR
7.601 s (3 allocations: 9.06 GiB)
[ Info: Benchmarking gbtranspose CSC
2.768 ms (29 allocations: 10.76 MiB)
[ Info: Benchmarking gbtranspose CSR
3.505 s (18 allocations: 9.06 GiB)
[ Info: Benchmarking csr_transpose
2.645 ms (15 allocations: 10.60 MiB)
[ Info: Benchmarking csr_transpose_2
2.305 ms (9 allocations: 9.68 MiB)
pkrysl@samadira sparse_transpose_tests.jl % julia --project t2.jl
[ Info: Matrix M = 60133 by N = 20133, sparsity = 0.0001662980393461161, nnz = 201839
[ Info: Benchmarking copy+transpose CSC
716.666 μs (12 allocations: 3.54 MiB)
[ Info: Benchmarking copy+transpose CSR
3.836 s (3 allocations: 9.02 GiB)
[ Info: Benchmarking gbtranspose CSC
1.047 ms (23 allocations: 4.92 MiB)
[ Info: Benchmarking gbtranspose CSR
3.233 s (18 allocations: 9.02 GiB)
[ Info: Benchmarking csr_transpose
784.334 μs (15 allocations: 3.54 MiB)
[ Info: Benchmarking csr_transpose_2
768.125 μs (9 allocations: 3.23 MiB)
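One observation on the numbers above: the allocation totals reported for the two CSR runs (copy+transpose and gbtranspose) match the size of a dense M by N array with 8-byte entries. This is an inference from the arithmetic, not something the log states, and the Float64 element type is an assumption. It suggests, without proving, that those code paths materialize a dense intermediate. A short check:

```julia
# Size in GiB of a dense M-by-N array with 8-byte entries
# (8 bytes per entry, i.e. Float64, is an assumption).
dense_gib(M, N) = M * N * 8 / 2^30

# (M, N, allocation in GiB reported for the CSR runs above)
cases = ((20133, 20133, 3.02), (20133, 60399, 9.06), (60133, 20133, 9.02))

for (M, N, reported) in cases
    computed = round(dense_gib(M, N); digits = 2)
    println("M = $M, N = $N: dense = $computed GiB, reported = $reported GiB")
end
```

All three computed values agree with the reported allocations to two decimal places, whereas the fast variants allocate only a few MiB, on the order of the nonzero storage.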
All of the timings were obtained on:
julia> versioninfo()
Julia Version 1.11.4
Commit 8561cc3d68d (2025-03-10 11:36 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin24.0.0)
CPU: 24 × Apple M2 Ultra
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, apple-m2)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)