Block diagonal matrices

Apologies if this has been discussed somewhere and I missed it in my searches, but I’m a bit confused by the behavior implemented for block diagonal matrices.

For example, I note that if I did something like

using LinearAlgebra
blocks = [rand(2,2) for _ in 1:5]
D      = Diagonal(blocks)
@show det(D), tr(D)

I get 2x2 matrices back. I recognize that this literally makes sense (the trace, for example, sums the diagonal elements, which here are themselves 2x2 matrices), but I think another natural way one might have expected this to behave is more like

_det(A) = prod(det.(A.diag))
_tr(A)  = sum(tr.(A.diag))

which treat the block diagonal structure simply as a convenient and lightweight wrapper around the sparse matrix whose nonzero entries are the diagonal blocks.

I expect that people thought about this and decided on the current implementation (and I see that the above functionality that does exist is actually quite new), so I’m curious to hear about why.

I think it would also be nice, for example, if there were some functionality whose use resembled

Diagonal(blocks)\rand(10)

where now Diagonal is really just a thin wrapper that means I don’t have to write a careful loop that does something like blocks[j] \ x[some_indices(j)] and so on. Maybe functions diagapply or diagsolve or something like that exist and I’m just not aware of them?
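
For concreteness, here is a minimal sketch of the kind of careful loop I mean (the name blockdiag_solve is made up, and square blocks laid out contiguously along the diagonal are assumed):

function blockdiag_solve(blocks, x)
  y, offset = similar(x), 0
  for B in blocks
    n = size(B, 1)                                   # square blocks assumed
    y[offset+1:offset+n] = B \ x[offset+1:offset+n]  # blockwise solve
    offset += n
  end
  return y
end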

I would be interested to hear people’s thoughts on this. I have seen similar packages (like BlockArrays.jl), but unless I’ve missed something in the docs they don’t exactly implement the functionality that I’m describing.

If this is something that people are interested in having but doesn’t exist yet, I’d be happy to contribute something (although I don’t consider my code to be of comparable quality to the existing software in JuliaMatrices).

We (@LaurentPlagne and myself) make heavy use of such structured block matrices, i.e. matrices defined as having a structure (diagonal, tridiagonal, …) whose elements are blocks (potentially structured block matrices themselves), but which still behave globally like a matrix of real (or rather floating-point) coefficients.

The last time we needed this was when we started learning Julia, and we ended up implementing our own library for this (mostly because we didn’t find any at first glance and it was a good exercise to learn Julia). But we were also relatively new to the language at the time, and our code is probably not the best to share. If you’d be interested in such a library, maybe we should start a new project and collaboratively work on this (after having performed a more serious search in the Julia ecosystem, like you’re currently doing…)


Hi @ffevote, thanks for the reply. I do think it would be nice if what we’re thinking of were in a library somewhere (if not included in LinearAlgebra itself).

In principle, I think that the basic tools that we (or at least I) have in mind could be a very simple addition somewhere, so maybe it would make sense to contribute them to even a tangentially related package that already exists (or, better yet, the LinearAlgebra module). But for things like block-tridiagonal matrices, where basically no operation can be extended by simple broadcasting, I would think a dedicated package definitely makes sense.

A minimal implementation that provides a thin wrapper for broadcasting and careful loops for applies/solves takes only about 60 lines of code (example below). It is obviously not complete or particularly optimized, but I would also guess that any implementation couldn’t get much more efficient. Did your implementation have a much broader scope than this?


module BlockDiagonal

  using  LinearAlgebra
  import LinearAlgebra: factorize, ldiv!, mul!, adjoint

  # Block-diagonal matrix: `sz` is the overall (rows, cols) size and `V`
  # holds the diagonal blocks in order.
  struct BDiagonal{T, MT}
    sz :: Tuple{Int64, Int64}
    V  :: Vector{MT}
  end

  # The overall size is the elementwise sum of the block sizes.
  BDiagonal(V) = BDiagonal{eltype(first(V)), eltype(V)}(mapreduce(size, (x,y)->x.+y, V), V)

  ## Basic functionality:

  Base.getindex(BD::BDiagonal{T, MT}, j::Int64) where{T, MT} = BD.V[j]

  Base.size(BD::BDiagonal{T, MT}) where {T,MT} = BD.sz

  # Materialize the full dense matrix by writing each block into place.
  function Base.collect(BD::BDiagonal{T, MT}) where{T,MT}
    (out, ri, ci) = zeros(T, size(BD)), 1, 1
    for Dj in BD.V
      (sz1, sz2) = size(Dj)
      @inbounds view(out, ri:(ri+sz1-1), ci:(ci+sz2-1)) .= Dj
      ri += sz1 
      ci += sz2
    end
    out
  end

  ## Basic linear algebra:

  # Factorization and adjoint act blockwise on the diagonal blocks:
  factorize(BD::BDiagonal{T, MT}) where{T,MT} = BDiagonal(factorize.(BD.V))

  adjoint(BD::BDiagonal{T, MT}) where{T,MT}   = BDiagonal(adjoint.(BD.V))

  # Generate ldiv! and mul! for both vector and matrix right-hand sides; they
  # share the same blockwise loop. Note that the same row range is used for
  # target and src, so the blocks are assumed to be square.
  for (fn, ar) in Iterators.product((:ldiv!, :mul!), (:StridedVector, :StridedMatrix))
    @eval begin
      function $fn(target::$ar{T}, BD::BDiagonal{T, MT}, src::$ar{T}) where{T, MT}
        ri = 1
        for Dj in BD.V
          (sz1, sz2) = size(Dj)
          @inbounds $fn(view(target, ri:(ri+sz1-1), :), Dj, view(src, ri:(ri+sz1-1), :))
          ri += sz1
        end
        target
      end
    end
  end

  # Solve directly if the blocks are already factorizations; otherwise
  # factorize blockwise first.
  function LinearAlgebra.:\(BD::BDiagonal{T, MT}, src::StridedArray{T}) where{T, MT}
    target = similar(src)
    MT <: Union{Factorization{T}, Adjoint{T,<:Factorization{T}}} && return ldiv!(target, BD, src)
    ldiv!(target, factorize(BD), src)
  end

  Base.:*(BD::BDiagonal{T, MT}, src::StridedArray{T}) where{T, MT} = mul!(similar(src), BD, src)

end
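
For what it’s worth, here is a quick usage sketch (assuming the module above has been loaded; the diagonally dominant random blocks are just to keep the solves well-conditioned):

using LinearAlgebra
using .BlockDiagonal: BDiagonal

blocks = [rand(3,3) + 3I for _ in 1:4]  # square, well-conditioned blocks
BD     = BDiagonal(blocks)
x      = rand(12)
y      = BD*x       # blockwise mul!
z      = BD\y       # factorizes each block, then blockwise ldiv!
@show norm(z - x)   # should be near zero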


Hi Chris,

François and I did not really come to a definitive conclusion on the best way to deal with (multi-)block matrices in Julia.

We work on PDEs where fields living in multidimensional discretized (phase-)spaces are related by differential operators, which are naturally expressed as tensorial linear operators and represented by (multi-)block matrices.
For example, using the canonical Poisson’s equation on a Cartesian grid:

\Delta V_{xy}= [\partial^2_x + \partial^2_y] V_{xy}=[L_x\otimes I_y + I_x\otimes L_y]V_{xy}=[A_{xy}+B_{xy}]V_{xy}=-\rho_{xy}

In this case, B_{xy}= I_x\otimes L_y can be seen as a diagonal block matrix of (e.g.) tridiagonal matrices.
If i (resp. j) corresponds to the x (resp. y) grid index, then
B_{xy}(i,i')=\delta_{i,i'}L_y, and therefore B_{xy}(i,i')(j,j')=\delta_{i,i'}(2\delta_{j,j'}-\delta_{j,j'-1}-\delta_{j,j'+1}).
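
For concreteness, a small sketch of these operators built with standard Julia types (the grid sizes are arbitrary, and SparseArrays is used only to keep the Kronecker products sparse):

using LinearAlgebra, SparseArrays

nx, ny = 4, 5
Lx = SymTridiagonal(fill(2.0, nx), fill(-1.0, nx-1))
Ly = SymTridiagonal(fill(2.0, ny), fill(-1.0, ny-1))

A = kron(sparse(Lx), sparse(I, ny, ny))  # L_x ⊗ I_y
B = kron(sparse(I, nx, nx), sparse(Ly))  # I_x ⊗ L_y: block diagonal with L_y blocks
C = A + B                                # the operator [L_x ⊗ I_y + I_x ⊗ L_y] above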

In this case an element (i,i') of the 2D matrix B_{xy} is a 1D (= standard) matrix. The corresponding 2D vectors like V_{xy} would behave as vectors of vectors (and not like 2D arrays):
V_{xy}[i] is a 1D vector V^i_{xy} with ny elements V^i_{xy}[j].

Following this approach, the recursive operations on MD matrices and vectors apply naturally. For example, a 2D matrix-vector product B_{xy}V_{xy} implies inner matrix-vector products:
(B_{xy}V_{xy})[i][j]=\sum_{i'}\big(B(i,i')V[i']\big)[j]=\sum_{i'}\Big(\sum_{j'}B(i,i')(j,j')V[i'][j']\Big)
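
As a plain-Julia sketch of that recursion (B stored as an nx×nx array of blocks and V as a vector of vectors; the sizes are arbitrary):

using LinearAlgebra

nx, ny = 4, 5
Ly = SymTridiagonal(fill(2.0, ny), fill(-1.0, ny-1))
B  = [i == i′ ? Ly : zeros(ny, ny) for i in 1:nx, i′ in 1:nx]  # I_x ⊗ L_y, blockwise
V  = [rand(ny) for i in 1:nx]

# (B*V)[i] = Σ_{i′} B[i,i′]*V[i′], where each term is itself an inner matvec
BV = [sum(B[i, i′] * V[i′] for i′ in 1:nx) for i in 1:nx]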

We did implement a C++ library for handling such MD matrices and MD vectors, and we have started to reimplement it in Julia. Since there are no ND arrays in C++, our approach was quite natural there, but we still wonder about the best fit with the Julia ecosystem. We also have not finished looking for similar Julia projects: if someone is considering the same topic, please do let us know so we can collaborate!

Here is an example of a 5-level matrix: [image of the 5-level block structure]

Some more details can be found here: Legolas++


Hi Laurent,

That is very interesting! I wish I knew more about numerical PDEs. The block-sparse structure in your example image looks very similar to things I have seen in passing. It is a bit more involved, though, and for matrices that are not block diagonal I think implementing any linear algebra would be much harder, because we couldn’t just do a simple broadcast.

From a quick look at the source code, the package BlockArrays.jl seems happy to take blocks of user-defined types so long as they are a subtype of AbstractArray, so I think combining a home-spun block diagonal or block tridiagonal matrix with BlockArrays.jl might provide a pretty quick solution. I don’t know that package very well myself, but I’m sure that even if it doesn’t support block sparsity, one could write a dummy struct for blocks that just returns zeros or something of the sort.
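
For instance, a minimal sketch of such a dummy zero block (the name ZeroBlock is made up; BlockArrays.jl may well provide something better):

struct ZeroBlock{T} <: AbstractMatrix{T}
  m::Int
  n::Int
end

Base.size(Z::ZeroBlock) = (Z.m, Z.n)
Base.getindex(Z::ZeroBlock{T}, i::Int, j::Int) where {T} = zero(T)

# Products with a zero block can skip all the work:
Base.:*(Z::ZeroBlock{T}, x::AbstractVector) where {T} = zeros(promote_type(T, eltype(x)), Z.m)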

What operations do you need to do with these matrices? If you just need solves and you’re okay with iterative solvers, putting something like that together would probably be easy enough. But I could imagine it becoming very hard if you need much more than that.


Thank you for the link! I have to take a serious look at BlockArrays.jl.

Basically, all I need to form recursively blocked matrices is a sparse block matrix structure that can contain blocks b that can perform the MV product mul!(y,b,x) and optionally the MV solve ldiv!(y,b,x). Of course, a sparse block matrix can itself contain sparse block matrices and may be suitable for prod and solve operations.
Hence a recursive solver is trivial to implement. Consider a sparse block matrix A containing invertible diagonal blocks A_{ii}; then a basic sparse Gauss-Seidel solver would imply operations like (assuming that we can iterate on the non-zero elements):
X_i=A_{ii}^{-1}(B_i - \sum_{j\neq i}A_{ij}X_j)

If the blocks A_{ij} are themselves matrices, then the products A_{ij}X_j are inner matrix-vector products and applying A_{ii}^{-1} is an inner linear solve. So the sparse GS is naturally recursive.
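
Here is a minimal sketch of one such sweep, with A a square array of blocks and B, X vectors of (block) vectors (the function name is made up, and iszero stands in for iterating over the structurally non-zero blocks only):

function block_gauss_seidel_sweep!(X, A, B)
  n = length(B)
  for i in 1:n
    r = copy(B[i])
    for j in 1:n
      (j == i || iszero(A[i, j])) && continue
      r .-= A[i, j] * X[j]  # inner matrix-vector product
    end
    X[i] = A[i, i] \ r      # inner linear solve; the recursion happens here
  end
  return X
end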

The important point is that the prod and solve operations must be properly specialized for the different types of matrices and block matrices. This is how Legolas++ works, and in practice it is not necessary to specialize on the sparse block matrix type itself, since all the real work is done on the leaf (non-blocked) matrices. The performance of the leaf operations is the key, and it depends strongly on proper specialization for the leaf matrix types.

Like I said, we did implement this in Julia, but my concern is that a matrix A must take 2 indices and return either a scalar or a matrix element that itself takes two indices. Similarly, a 2D vector X takes only one index, and its elements are 1D vectors taking one index: X[i][j] and not X[i,j]. Of course, all the elements of X are stored in a contiguous 1D Julia Array, and X[i] is just a thin wrapper handling the proper data section. This approach leads to a definition of MD matrices and vectors which is different from standard Julia arrays, and I wonder about this cohabitation: is it necessary? Is it sustainable? Is it error-prone?


Hi Laurent,

I’m a little worried that I’m not understanding your concern. With regard to the last paragraph, I would think that if you wrapped everything in a struct, you could define Base.getindex to behave however you wanted, so that you could write A[i,j] or A[i][j] in such a way that a recursive GS code would work smoothly.
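
For example (a toy sketch with a made-up BlockVec type and a uniform block size), both access styles can coexist on one wrapper:

struct BlockVec{T}
  data::Vector{T}  # contiguous storage
  bsz::Int         # uniform block size, for simplicity
end

# X[i] returns a lightweight view of block i...
Base.getindex(v::BlockVec, i::Int) = view(v.data, (i-1)*v.bsz+1 : i*v.bsz)
# ...while X[i, j] reaches directly into element j of block i.
Base.getindex(v::BlockVec, i::Int, j::Int) = v.data[(i-1)*v.bsz + j]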

At the moment, I’m a little nervous about committing to co-develop software for a domain that I don’t know particularly well (I am concerned about having the time to do right by co-developers), so I think I may not be in a position to be very helpful with a new package that goes beyond a sort of trivial block diagonal matrix structure with simple functionality. But if somebody else has thoughts or you end up with a solution that you’re happy with, I would love to see it and learn!

Exactly. It is easy to define such a thing in Julia. The concern is not how to do it, but rather whether this is the most idiomatic way to do it, especially knowing that multi-dimensional arrays are already “first-class citizens” in the language.


Hi François,

Ah, now I understand. That’s a good question that I hadn’t thought about very seriously before. I would be interested to hear thoughts on the question from any devs who happen upon this thread…

In the meantime (and in case somebody looking for code stumbles on this thread in the future), I’ve released a stupid little package BlockDiagonal.jl that has the functionality I was looking for a few days ago. It is literally 60 lines of code and really doesn’t do much but provide the methods that can obviously be specialized for block diagonal matrices. It is just a slightly refined version of the code pasted a few posts above.
