In the finite element numerical simulation, the last step is to calculate the linear equations of the overall stiffness matrix: K×d=F .
In julia, the method of solving linear equations can be directly “\”: d = K \ F

Question:

When the matrix is very large, what method can be used to speed up the calculation?
There are the following requirements on the method:

The calculation accuracy is best not to change

My computer is a 40-core workstation, maybe I can consider multi-core parallelism? (I’m not sure if there is such a way)

All the methods provided are preferably simple to implement (I don’t need to change too much code )

Welcome everyone to make suggestions, or what information can be used for reference, thank you all.

This is just a syntax, not a method. Which method gets called for K \ F depends on the types of K and F.

For example, if K is a sparse-matrix object, it typically calls a sparse-direct solver. You could easily create a wrapper object so that K \ F called an iterative solver like GMRES (e.g. from IterativeSolvers.jl), for example. We put together a draft PETSc.jl package (which hasn’t been updated in a while, but could be brought up to date if there is developer interest) in which you have a “KSP” wrapper for sparse matrices where \ calls a PETSc Krylov iterative method.

Which method you use is highly problem dependent. For finite-element methods, which produce sparse matrices, I would generally recommend using a sparse-direct solver (especially for 2d problems) until it runs out of memory (which happens for medium-size 3d problems). At that point you have to switch over to an iterative algorithm like GMRES/biCGSTAB2/DQGMRES/… (or conjugate-gradient if your problem is SPD), and at that point you have to know a bit more (especially how to precondition your problem) and some trial-and-error is often required to find a good algorithm.

An interesting follow-up: K\F is syntax, and the actually chosen method depends on the type of matrix K (and also of F?). But if the CPU has, say, 40 cores… will the chosen method know this and automatically parallelize the solver?

I don’t quite understand your question — it depends on the implementation of the method.

The stdlib dense solver is multithreaded (thanks to OpenBLAS). The stdlib sparse-direct solver is not. The MUMPS.jl sparse-direct solvers can use multiple processes. There are also various other packages with parallel solvers of various kinds. Iterative solvers with a distributed-array type are parallel, for example.

As to how many cores these parallel methods use, it depends on which parallelization method they use and how many threads or processes you launch Julia with.

Thanks for answers — I think you answered what I was thinking of. What I meant was:

Do the LinearAlgebra (and other) methods support parallelization and multiple cores?

If they support parallelization/multiple cores, will they on their own find out how many cores are available and activate these cores?

My understanding of your answers is:

Some methods support parallelization (those of the OpenBLAS library, i.e., for dense matrices? + some others), but it depends on the data type(?), and

The methods do not figure out how many cores the CPU has by themselves, i.e., the user has to specify the number of cores upon starting Julia from the command line/upon lanuch (?)