I would like to solve a linear system A x=b where A is band by block. The bandwith can be big (~10). An example of such matrix is for example the one in the trapezoid method.

I can do a LU decomposition to solve this but I am expecting that many operations will be wasted by the zero filling. Is there a way to improve this? Perhaps use a sparse matrix even if the blocks are not sparse? Use BlockBandedMatrices ?

I would just try BandedMatrices.jl, BlockBandedMatrices.jl, and the generic SparseArrays.jl solvers, and see which ones work best for your matrix size and sparsity. (Typically, using a generic sparse solver will be slower — but only by a constant factor — than using a solver that is specialized for a banded structure, but YMMV.)

Generically, for a banded matrix (of bounded bandwidth) you can get asymptotically linear (\sim m) scaling to solve Ax=b for an m \times m matrix A, so it is definitely worth taking advantage of when m is large, compared to a dense solver which requires \sim m^3 work and \sim m^2 memory.

(A bandwidth of 10 is not particularly big, which is why I would suggest trying BandedMatrices.jl too even if the bands come in blocks.)

Thank you for your answer. I will try those and see if it’s worth to write a dedicated linear solver. Basically I am wondering if it worth implementing the method of condensation of parameters (see [1,2]) to speed up the computation of periodic orbits based on a collocation method.

[1] Lust, Kurt. “Improved Numerical Floquet Multipliers.” International Journal of Bifurcation and Chaos 11, no. 09 (September 2001): 2389–2410. https://doi.org/10.1142/S0218127401003486.

[2] Doedel, Eusebius, Herbert B. Keller, and Jean Pierre Kernevez. “NUMERICAL ANALYSIS AND CONTROL OF BIFURCATION PROBLEMS (II): BIFURCATION IN INFINITE DIMENSIONS.” International Journal of Bifurcation and Chaos 01, no. 04 (December 1991): 745–72. https://doi.org/10.1142/S0218127491000555.