Best performance for initialising a variable number of matrices/vectors within a Crank-Nicolson scheme

I’m solving a diffusion problem on several connected domains. I use a Crank-Nicolson scheme on each domain, and at every time step I connect the domains by some matching conditions at the boundary. To apply the Crank-Nicolson update I have to solve u = A\b at every time step for each domain, where u is the solution vector.
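(For reference, assuming the semi-discrete form u' = L·u on each domain, the Crank-Nicolson step is

(I - Δt/2 · L) u_new = (I + Δt/2 · L) u_old

so A = I - Δt/2 · L and b = (I + Δt/2 · L) u_old. In particular, A is fixed as long as Δt and L don’t change.)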

Now I’d like to make the code general in terms of the number of domains. Previously, for n=3 I used to have something like

A_1 = ...
A_2 = ...
A_3 = ...

similar for u and b, with the difference that u and b change at every time step.
I’d then do

u_1 = A_1\b_1
u_2 = A_2\b_2
u_3 = A_3\b_3

When going to the arbitrary-n case, I naively just used list comprehensions to create all the matrices and vectors; however, that comes with a 10-20% performance hit. I was wondering about the best strategy here.
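For concreteness, here is roughly what my list-comprehension version looks like (schematic only: `build_A` and `build_b` stand in for my actual assembly code):

```julia
# Schematic sketch, not my real code: build_A and build_b are placeholders.
n = 3                                   # number of domains
A = [build_A(i) for i in 1:n]           # one system matrix per domain
u = [zeros(size(A[i], 1)) for i in 1:n] # one solution vector per domain

for step in 1:nsteps
    for i in 1:n
        b = build_b(i, u)   # RHS, depends on matching conditions at the boundaries
        u[i] = A[i] \ b     # this allocates a fresh u[i] every step
    end
end
```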


Welcome to Julia Discourse!

You will find many experts here on numerical linear algebra and solutions of partial differential equations. I am not one of them, but I do have a couple of suggestions.

I think you might get more responses if you showed in more explicit detail what you’ve attempted so far for the general case of n regions.

Also, if the A matrices are independent of time step, then I think you could get large execution time savings by precomputing and reusing the factorizations of these matrices. I.e., if make_region_A(i) is called to create the A matrix for region i, then I would replace, e.g.,

As = [make_region_A(i) for i in 1:n]

with

factored_As = [factorize(make_region_A(i)) for i in 1:n]

and then for each time step the solves would look like

for i in 1:n
    ldiv!(factored_As[i], b[i])
end
Here ldiv!(factored_As[i], b[i]) overwrites b[i] with the solution, which avoids unnecessary allocations of new u vectors at each time step.
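Putting the pieces together, a self-contained sketch might look like this (with a toy make_region_A and random right-hand sides standing in for your actual setup):

```julia
using LinearAlgebra

n = 3
# Toy stand-in for the real per-region system matrix.
make_region_A(i) = Tridiagonal(fill(-1.0, 9), fill(2.0 + i, 10), fill(-1.0, 9))

# Factor once, before the time loop.
factored_As = [factorize(make_region_A(i)) for i in 1:n]
b = [rand(10) for i in 1:n]

for step in 1:100
    for i in 1:n
        # Overwrite b[i] with the solution of A_i * x = b[i]; no new vector allocated.
        ldiv!(factored_As[i], b[i])
        # ... then update b[i] from the matching conditions for the next step ...
    end
end
```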

I’m sure the experts will chime in with lots of additional (and better) suggestions.


If the matrices stay the same, while only u and b change on each timestep, then you should definitely precompute the matrix factorizations.

e.g. precompute F_1 = lu(A_1) (or cholesky if it is symmetric positive definite), and then do u_1 = F_1 \ b_1. Better yet, update u in-place with ldiv!(u_1, F_1, b_1).

(Oh, I see that @PeterSimon already made a similar suggestion.)

If you’re spending lots of time allocating vectors, then you should probably be allocating them once and then working in-place.
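As a sketch (names hypothetical; assumes your matrices `As` and right-hand sides `bs` are already set up), the preallocated, in-place version of the time loop would look something like:

```julia
using LinearAlgebra

Fs = [lu(A) for A in As]          # factor once; use cholesky for SPD matrices
us = [similar(b) for b in bs]     # preallocate solution vectors once
n = length(As)

for step in 1:nsteps
    for i in 1:n
        ldiv!(us[i], Fs[i], bs[i])   # us[i] .= A_i \ bs[i], no allocation
        # ... update bs[i] in place from us and the matching conditions ...
    end
end
```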


Also, is there a specific reason you are using hand-written Crank-Nicolson? You likely will get better results with OrdinaryDiffEq.jl which would let you use much fancier solvers easily.
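For instance, a minimal sketch (not tailored to your problem): if each domain’s semi-discrete system is du/dt = L*u, you could write something like the following, where the toy `L` stands in for your spatial discretisation:

```julia
using OrdinaryDiffEq, LinearAlgebra

# Toy stand-in for the spatially discretised diffusion operator of one domain.
N = 10
L = Tridiagonal(fill(1.0, N - 1), fill(-2.0, N), fill(1.0, N - 1))

f!(du, u, p, t) = mul!(du, L, u)   # du/dt = L*u

u0 = rand(N)
prob = ODEProblem(f!, u0, (0.0, 1.0))
sol = solve(prob, Trapezoid())     # Trapezoid() is the Crank-Nicolson method
```

Matching conditions between domains could then be enforced with a callback applied at each step, and you get adaptive time stepping and fancier implicit solvers essentially for free.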


Great, thanks everyone! The factorisation / LU decomposition alone made a 20% difference. @Oscar_Smith: I’m interested in this! So here’s the problem stated a bit more formally. Could you give me a few keywords/hints on how to use OrdinaryDiffEq.jl effectively to solve this faster than repeated backslash solves?

I thought the biggest challenge was to use an efficient solver while at the same time maintaining enough flexibility to update the matching conditions between domains at every time step. Otherwise, of course, a variable time step, for example, could hugely improve things…


5 posts were split to a new topic: Why re-use factorization(A) rather than inv(A)