Blockdiag for dense matrices

Quite often I want to form dense block diagonal matrices from a few dense matrices. Is there an easy way to do this? There is blockdiag for sparse matrices:

help?> blockdiag
search: blockdiag

  blockdiag(A...)

  Concatenate matrices block-diagonally. Currently only implemented for sparse matrices.

Is there some reason that this hasn’t been implemented for dense matrices or is there some other way to achieve this? (preferably without using additional packages)

1 Like

Apply Matrix to the result of blockdiag?

blockdiag only accepts sparse matrices as inputs. Converting the dense matrices to sparse matrices and then converting the result to a dense matrix seems inefficient and inconvenient.

Perhaps methods in the BlockDiagonals package would be more suitable?

1 Like

A simple test:

julia> v = [rand(2,2), rand(3,3)];

julia> @benchmark BlockDiagonal(v)
BenchmarkTools.Trial:
  memory estimate:  16 bytes
  allocs estimate:  1
  --------------
  minimum time:     84.702 ns (0.00% GC)
  median time:      91.884 ns (0.00% GC)
  mean time:        110.563 ns (7.64% GC)
  maximum time:     56.262 μs (99.83% GC)
  --------------
  samples:          10000
  evals/sample:     961

julia> @benchmark begin
       Matrix(blockdiag(sparse.(v)...))
       end
BenchmarkTools.Trial:
  memory estimate:  2.66 KiB
  allocs estimate:  26
  --------------
  minimum time:     5.117 μs (0.00% GC)
  median time:      6.183 μs (0.00% GC)
  mean time:        7.641 μs (17.00% GC)
  maximum time:     7.730 ms (99.83% GC)
  --------------
  samples:          10000
  evals/sample:     6

Conclusion: creating the block diagonal matrix using package BlockDiagonals with method BlockDiagonal is roughly 60 times faster than converting between dense and sparse matrices – for this simple case.

If I convert the BlockDiagonal matrix to a regular Matrix, most of the advantage is lost:

julia> @benchmark collect(BlockDiagonal(v))
BenchmarkTools.Trial:
  memory estimate:  7.17 KiB
  allocs estimate:  77
  --------------
  minimum time:     3.275 μs (0.00% GC)
  median time:      4.275 μs (0.00% GC)
  mean time:        6.096 μs (25.25% GC)
  maximum time:     6.393 ms (99.90% GC)
  --------------
  samples:          10000
  evals/sample:     8

Finally, it is of interest to see how much slower multiplication with a BlockDiagonal matrix is:

julia> M_bl = BlockDiagonal(v);

julia> M = collect(M_bl);

julia> b = rand(5);

julia> @benchmark M_bl*b
BenchmarkTools.Trial:
  memory estimate:  624 bytes
  allocs estimate:  7
  --------------
  minimum time:     402.000 ns (0.00% GC)
  median time:      418.995 ns (0.00% GC)
  mean time:        642.634 ns (19.39% GC)
  maximum time:     261.817 μs (99.77% GC)
  --------------
  samples:          10000
  evals/sample:     200

julia> @benchmark M*b
BenchmarkTools.Trial:
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     132.351 ns (0.00% GC)
  median time:      135.640 ns (0.00% GC)
  mean time:        179.146 ns (7.58% GC)
  maximum time:     69.557 μs (99.69% GC)
  --------------
  samples:          10000
  evals/sample:     881

Conclusion: matrix-vector multiplication when the matrix is BlockDiagonal is 3 times slower than when the matrix is a regular Matrix – for this simple example.

You benchmarked with nonconst globals.
That will not give realistic results.
You need to either make them const or interpolate them into the benchmark expression.

See docs: https://docs.julialang.org/en/latest/manual/performance-tips/#Avoid-global-variables-1
BenchmarkTools Readme: GitHub - JuliaCI/BenchmarkTools.jl: A benchmarking framework for the Julia language


also thats a pretty small matrix, its not surprising to see dense matrix multiplcation work better.
StaticArrays would likely be even faster still at that size.

Making functions of the various operations didn’t change much. Using BlockDiagonal to create the matrix with v = [rand(2,2), rand(3,3)] was ca. 50-60 times faster than if I created a sparse matrix, created the block diagonal, and converted back to dense matrix.

With v = [rand(200,200), rand(300,300)] instead, using BlockDiagonal was some 1e5 faster. Probably because the wrapping of sparse, blockdiagonal, and Matrix function creates temporary arrays. So I’d need to be more careful for a fair comparison. On the other hand, using BlockDiagonal is extremely simple, and seems to be well optimized.

Wrt. multiplying a matrix of type BlockDiagonal with a dense vector, this was some 3 times slower for the small matrix (5,5) compared to multiplying a dense matrix with a dense vector. However, for the large matrix (500,500), the computational times were relatively similar.

Hmm, BlockDiagonals seems great for exploiting the structure of block diagonal matrices, but I am simply looking for a convenient way of forming a (dense, relatively small) block diagonal matrix. I was hoping this could be done without using yet another package. It seems like a quite common thing to do (at least for me) and that it would be nice if there was a function for this in Base? In particular since there is the corresponding function for sparse matrices.

How’s this?

julia> blocks = [ rand(i,i) for i in 1:3 ];
julia> cat(blocks...,dims=(1,2))
6×6 Array{Float64,2}:
 0.448769  0.0       0.0       0.0       0.0       0.0      
 0.0       0.744482  0.225303  0.0       0.0       0.0      
 0.0       0.463063  0.332314  0.0       0.0       0.0      
 0.0       0.0       0.0       0.688196  0.263436  0.896587 
 0.0       0.0       0.0       0.694443  0.403921  0.0913202
 0.0       0.0       0.0       0.486398  0.299153  0.526531 
7 Likes

Julia, like TeX, is a language in which packages are the standard way to get functionality.
It’s largely a historical artifact that sparse matrixes have a standard library.

1 Like

@tobydriscoll, nice, yes it was something like that I was looking for, that looks as good as it gets without a dedicated method.

Actually stumbled upon a dedicated method. It seems to be a bit faster

julia> blocks = [ rand(i,i) for i in 1:3 ];
julia> @btime ControlSystems.blockdiag(blocks...)
  275.169 ns (3 allocations: 592 bytes)
julia> @btime cat(blocks...,dims=(1,2))
  9.080 μs (52 allocations: 2.31 KiB)

blockdiag is quite analogous to LinearAlgebra.diagm which is also a convenience function for creating dense matrices, so LinearAlgebra would be the appropriate place. But I take it that there are not so many others that have found a need for this.

1 Like