Parallelizing multiple Crank–Nicolson solvers

LaurentPlagne · March 12, 2021, 4:14pm

Not completely related to the original question on multithreaded scaling, but regarding the specific example (multiple tridiagonal systems of the same size, nc=3200, to be solved): you can get a faster solution with the Thomas algorithm (without pre-factorization) applied in a SIMD manner on blocks of systems (you have to adapt the data layout). You can see these slides and this paper. Note that this part at least can be easily ported to GPU.
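To make the layout idea concrete, here is a minimal sketch of a batched Thomas solve. It is not taken from the slides or the paper. The function name `thomas_batched!`, the argument names, and the `(nsys, n)` array layout are all assumptions chosen for illustration. The point is only that the system index is the contiguous one, so the inner loop over systems can vectorize. No pivoting is done, so the sketch assumes well-conditioned (e.g. diagonally dominant) systems.

```julia
using LinearAlgebra

# Solve nsys independent tridiagonal systems of size n with the Thomas algorithm.
# Layout assumption (hypothetical): every array is (nsys, n), so the system
# index s is the fast, contiguous index in Julia's column-major storage.
#   dl[s, i]: sub-diagonal   (dl[s, 1] is never read)
#   d[s, i] : diagonal
#   du[s, i]: super-diagonal (du[s, n] does not affect the solution)
#   b[s, i] : right-hand side
# cp and dp are preallocated work arrays of the same size; x receives the solution.
function thomas_batched!(x, dl, d, du, b, cp, dp)
    nsys, n = size(d)
    # Forward elimination, first row of every system
    @inbounds @simd for s in 1:nsys
        cp[s, 1] = du[s, 1] / d[s, 1]
        dp[s, 1] = b[s, 1] / d[s, 1]
    end
    # Forward elimination, remaining rows; the inner loop runs across systems
    @inbounds for i in 2:n
        @simd for s in 1:nsys
            denom = d[s, i] - dl[s, i] * cp[s, i-1]
            cp[s, i] = du[s, i] / denom
            dp[s, i] = (b[s, i] - dl[s, i] * dp[s, i-1]) / denom
        end
    end
    # Back substitution
    @inbounds @simd for s in 1:nsys
        x[s, n] = dp[s, n]
    end
    @inbounds for i in n-1:-1:1
        @simd for s in 1:nsys
            x[s, i] = dp[s, i] - cp[s, i] * x[s, i+1]
        end
    end
    return x
end

# Small usage example with random, diagonally dominant systems
nsys, n = 8, 16
dl = rand(nsys, n); du = rand(nsys, n)
d  = 4 .+ rand(nsys, n)
b  = rand(nsys, n)
x  = similar(b); cp = similar(b); dp = similar(b)
thomas_batched!(x, dl, d, du, b, cp, dp)

# Compare one system against Julia's built-in tridiagonal solve
s = 3
T = Tridiagonal(dl[s, 2:n], d[s, :], du[s, 1:n-1])
@assert x[s, :] ≈ T \ b[s, :]
```

The recurrence along `i` is sequential within each system, so all the parallelism comes from the `s` loop. In this sketch a block of systems is simply the first dimension of the arrays; independent blocks could in principle be handed to different threads or mapped to a GPU kernel.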

jagot · March 13, 2021, 5:26am

This seems like a highly relevant discussion: Overhead of `Threads.@threads` - #29 by Elrod
