Most sparse solvers allow you to cache the result that is specific for the sparsity pattern (e.g. Pardiso, wrapped in Pardiso.jl).

I don’t see what this has to do with type parameters.


Well, it’s an idea that led to an idea! I am not saying this is the best way to do it, probably not. I will check out Pardiso.jl, thanks for that.

I believe I found a potentially good application in FEA and topology optimization. As you know, the global stiffness matrix is nothing but the assembly of element stiffness matrices, which are small and sparse, with at most a few hundred non-zeros for all practical purposes, usually far fewer. If the elements are all of the same type and similar interpolation and quadrature rules are used, then the sparsity pattern will be identical across the elements. Even in a heterogeneous problem, the number of unique sparsity patterns will be very small.

Now, instead of assembling the global stiffness matrix and then multiplying it by some vector in the context of an iterative solver, one could multiply each element stiffness matrix by a view of the solution vector at the indices of the element’s relevant degrees of freedom, and then assemble the resulting element vectors after each multiplication. If the advantage of a matrix-vector multiplication specialized for the sparsity pattern (along with SIMD and what not) is large enough to make up for the cost of vector assembly in every iteration, then this approach may beat the standard solver approach.
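To make the gather/multiply/scatter idea concrete, here is a minimal sketch in Python (the thread is about Julia, but Python works the same way for illustration). The element matrices, dof maps, and the `matfree_matvec` name are all made up for this toy example; a real implementation would replace the inner dense matvec with the sparsity-pattern-specialized kernel discussed above.

```python
# Sketch of the matrix-free product y = K * x, where the global stiffness
# matrix K is never assembled. Toy data, not from any real discretization.

def matfree_matvec(elements, x, n):
    """elements: list of (Ke, dofs) pairs; Ke is a dense element stiffness
    matrix (list of rows), dofs are the element's global dof indices."""
    y = [0.0] * n
    for Ke, dofs in elements:
        # Gather: view of x at the element's degrees of freedom.
        xe = [x[j] for j in dofs]
        for i, row in enumerate(Ke):
            acc = 0.0
            # Small dense matvec; this is where a kernel specialized for the
            # element's sparsity pattern (SIMD and what not) would plug in.
            for a, b in zip(row, xe):
                acc += a * b
            # Scatter-add: assemble the element result into the global vector.
            y[dofs[i]] += acc
    return y

# Two 1D linear "bar" elements on 3 nodes, sharing node 1.
Ke = [[1.0, -1.0], [-1.0, 1.0]]
elements = [(Ke, [0, 1]), (Ke, [1, 2])]
x = [1.0, 2.0, 4.0]
y = matfree_matvec(elements, x, 3)  # equals the assembled K times x
```

Since both elements reuse the same `Ke` and the same pattern, only one specialized kernel would be needed here, which is the point made above about few unique sparsity patterns.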

Whether that happens or not, I am not sure, but there is at least reason to keep going with the experiment. If you still think otherwise, I would love to hear your thoughts. It might be a tough nut to crack, but maybe it is possible?


This is known as a “matrix-free method”. See e.g. https://www.dealii.org/8.5.0/doxygen/deal.II/step_37.html


Yes, that seems like a very elegant way to write matrix-free operators. I would give it a try!


Better still, if the vector assembly function is generated, it can be written in a thread-safe way, and then I believe the whole code becomes amenable to GPU acceleration via CUDAnative. I guess I will start prototyping soon!
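One standard way to make the scatter-add assembly thread-safe, sketched below in toy Python (names and data are my own, not from this thread): color the elements so that no two elements of the same color share a degree of freedom. All elements of one color can then be assembled in parallel, whether by CPU threads or one GPU kernel launch per color, with no write conflicts.

```python
# Greedy element coloring: elements in the same color touch disjoint dofs,
# so their scatter-adds into the global vector cannot race. Toy sketch only.

def color_elements(dof_maps):
    """dof_maps: one list of global dof indices per element.
    Returns a list of colors, each a list of element indices."""
    colors = []  # each entry: (set of dofs already used, member element ids)
    for e, dofs in enumerate(dof_maps):
        for used, members in colors:
            if used.isdisjoint(dofs):  # no shared dof -> safe in this color
                used.update(dofs)
                members.append(e)
                break
        else:
            colors.append((set(dofs), [e]))  # open a new color
    return [members for _, members in colors]

# Four elements on a 1D chain of 5 nodes; neighbors share one node, so
# two colors suffice: even elements and odd elements.
dof_maps = [[0, 1], [1, 2], [2, 3], [3, 4]]
coloring = color_elements(dof_maps)  # [[0, 2], [1, 3]]
```

Within each color the per-element work is embarrassingly parallel, which is what makes the generated assembly function amenable to GPU execution.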

1 Like