Best performance for initialising a variable number of matrices/vectors within a Crank-Nicolson scheme

lhcgeneva · April 17, 2024, 3:47pm

I’m solving a diffusion problem of several connected domains. I use a Crank-Nicolson scheme on each domain, and at every time step I connect the domains by some matching conditions at the boundary. To do the Crank-Nicolson I have to do u = A\b, at every time step for each domain, where u is the solution vector.

Now I’d like to make the code general in terms of the number of domains. Previously, for n=3 I used to have something like

A_1 = ...
A_2 = ...
A_3 = ...

similar for u and b, with the difference that u and b change at every time step.
I’d then do

u_1 = A_1\b_1
u_2 = A_2\b_2
u_3 = A_3\b_3

When going to the arbitrary n case, I naively just used list comprehensions for creating all the matrices and vectors, however, that’s associated with a 10-20% performance hit. I was wondering about the best strategy here.

PeterSimon · April 17, 2024, 8:52pm

Welcome to Julia Discourse!

You will find many experts here on numerical linear algebra and solutions of partial differential equations. I am not one of them, but I do have a couple of suggestions.

I think you might get more responses if you showed in more explicit detail what you’ve attempted so far for the general case of n regions.

Also, if the A matrices are independent of time step, then I think you could get large execution time savings by precomputing and reusing the factorizations of these matrices. I.e., if make_region_A(i) is called to create the A matrix for region i, then I would replace, e.g.,

As = [make_region_A(i) for i in 1:n]

with

factored_As = [factorize(make_region_A(i)) for i in 1:n]

and then for each time step the solves would look like

for i in 1:n
    ldiv!(factored_As[i], b[i])
end

This avoids unnecessary allocations of new u vectors at each time step.

I’m sure the experts will chime in with lots of additional (and better) suggestions.

stevengj · April 17, 2024, 8:59pm

If the matrices stay the same, while only u and b change on each timestep, then you should definitely precompute the matrix factorizations.

e.g. precompute F_1 = lu(A_1) (or cholesky if it is PSD), and then do u_1 = F_1 \ b_1. Better yet, update u in-place with ldiv!(u_1, F_1, b_1).

(Oh, I see that @PeterSimon already made a similar suggestion.)

If you’re spending lots of time allocating vectors, then you should probably be allocating them once and then working in-place.

Oscar_Smith · April 17, 2024, 9:43pm

Also, is there a specific reason you are using hand-written Crank-Nicolson? You likely will get better results with OrdinaryDiffEq.jl which would let you use much fancier solvers easily.

lhcgeneva · April 19, 2024, 6:56am

Great, thanks everyone! The factorisation / lu decomposition made a 20% difference already. @Oscar_Smith : I’m interested in this! So here’s the problem stated a bit more formally. Could you give me a few keywords/hints on how to use OrdinaryDiffEq.jl effectively to solve this quicker than using backslash?

I thought the biggest challenge was to use an efficient solver while at the same time maintaining enough flexibility to update the matching conditions between domains at every time step. Otherwise, of course, a variable time step, for example, could hugely improve things…

stevengj · April 19, 2024, 12:35pm

5 posts were split to a new topic: Why re-use factorization(A) rather than inv(A)

Topic		Replies	Views
Parallelizing multiple Crank–Nicolson solvers Performance linearalgebra	21	1437	March 13, 2021
Improving Performance of For-Loop and Matrix Multiplication for Time Integration Solver Performance loops , matrices	8	1133	June 1, 2020
Solving a linear system without modifying and allocating arrays Performance	3	133	March 25, 2025
Optimising iterative solution of small system Performance linearalgebra	12	469	April 16, 2024
Speeding up solving `Ax=b` repeated when `A` is partially updated Performance linearalgebra	4	872	December 29, 2022

Best performance for initialising a variable number of matrices/vectors within a Crank-Nicolson scheme

Related topics