Yes, the code is parallel: it uses `@threads` in a `for` loop to assemble the global stiffness matrix from the element stiffness matrices. No memory is allocated inside the `for` loop, though; all the required arrays are allocated in advance. I will try removing the `@threads` and report what happens.
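A minimal sketch of this kind of threaded assembly, under stated assumptions: 1D two-node elements with a fixed element matrix `ke`, and preallocated triplet arrays that each element fills in its own slots so threads never write to the same index. The function name `assemble_threaded` and the triplet layout are hypothetical, chosen for illustration, and not necessarily how the actual code is organized.

```julia
using Base.Threads, SparseArrays

# Hypothetical example: assemble a 1D "bar" stiffness matrix from nel elements.
function assemble_threaded(nel)
    nn = nel + 1                          # number of nodes
    ke = [1.0 -1.0; -1.0 1.0]             # assumed element stiffness matrix
    # Triplet storage allocated once, before the threaded loop.
    rows = Vector{Int}(undef, 4nel)
    cols = Vector{Int}(undef, 4nel)
    vals = Vector{Float64}(undef, 4nel)
    @threads for e in 1:nel
        dofs = (e, e + 1)                 # global dofs of element e
        k = 4 * (e - 1)                   # element e owns slots k+1 .. k+4
        for a in 1:2, b in 1:2
            k += 1
            rows[k] = dofs[a]
            cols[k] = dofs[b]
            vals[k] = ke[a, b]
        end
    end
    # sparse() sums duplicate (row, col) entries, which performs the assembly.
    return sparse(rows, cols, vals, nn, nn)
end

K = assemble_threaded(4)
```

Because every element writes only to its own four slots, the loop body needs no locks and allocates nothing; the only allocation after the loop is the final `sparse` call.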
What I meant by re-using is that the same factorization is used for several different problems before a new stiffness matrix is assembled and factorized. Factorization is the most expensive step, so it makes sense to keep using the same factorized matrix, at the cost of more iterations, rather than assembling and factorizing at every step (which would take only one iteration).
Reusing the factorization in the sense you showed in your post did not prove effective in my case.
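A minimal sketch of the first kind of reuse (keeping an old factorization and paying with extra iterations), assuming a simple residual-correction loop. The helper `solve_with_stale_factorization`, the tridiagonal test matrices `K0` and `K1`, and the tolerances are all hypothetical; the iteration only converges when the current matrix stays close to the factorized one.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper: solve A*x = b using a factorization F of a nearby matrix.
# Each pass costs one matrix-vector product and one pair of triangular solves.
function solve_with_stale_factorization(A, F, b; tol=1e-10, maxiter=50)
    x = F \ b                              # first guess from the old factorization
    for it in 1:maxiter
        r = b - A * x                      # residual with the *current* matrix
        norm(r) <= tol * norm(b) && return x, it
        x += F \ r                         # correction using the old factorization
    end
    return x, maxiter
end

n = 50
# Assumed stand-in for an assembled stiffness matrix (well conditioned on purpose).
K0 = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-1.0, n - 1))
F = lu(K0)                                 # expensive step, done once
K1 = K0 + 0.1I                             # slightly changed matrix, not refactorized
b = ones(n)

x, iters = solve_with_stale_factorization(K1, F, b)
```

With a fresh factorization of `K1` the solve would take a single step; here the loop needs a few passes instead, which is the trade-off described above.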