The matrix \mathbf{A} is the result of a FE mesh with hexahedral elements of a solid mechanics problem, with 3 degrees of freedom (DoFs) per node. I have Dirichlet boundary conditions on the boundary and free internal nodes, thus the problem is constructed as follows
\mathbf{A} = \mathbf{Kt}[\textrm{iDoFs},\textrm{iDoFs}]\\
\mathbf{r} = \mathbf{Kt}[\textrm{iDoFs},\textrm{bDoFs}]\mathbf{s}[\textrm{bDoFs}]
where \mathbf{Kt} is the stiffness matrix of the entire domain, \textrm{bDoFs} are the indices of the boundary DoFs and \textrm{iDoFs} are the indices of the internal DoFs, \mathbf{s} is the entire DoFs vector. \mathbf{A} is Hermitian, with size(A)=(162372, 162372)
Depending on the position of the element, whether internal or on the boundary, each node can be surrounded from 1 up to 8 elements, which decides the number of of non zero entries per each row of the matrix, all diagonal entries are non zero. It is a quite structured matrix, which is difficult to obtain by random generation, if it helps can share the matrix and the vectors as JLD2
files.
This is what I tried
u = @btime A\r
@show norm(r-A*u);
3.843 s (52 allocations: 1.39 GiB)
norm(r - A * u) = 5.766109468498555e-12
u = @btime lu(A)\r
@show norm(r-A*u);
10.340 s (72 allocations: 2.39 GiB)
norm(r - A * u) = 4.227052608619862e-12
then with LinearSolve.jl
prob = LinearProblem(A,r);
sol = @btime LinearSolve.solve(prob, UMFPACKFactorization())
@show norm(r-A*sol.u);
10.984 s (94 allocations: 2.71 GiB)
norm(r - A * sol.u) = 4.227052608619862e-12
then using precodintioners from AlgebraicMultigrid.jl
with Krylov methods I get
m = AlgebraicMultigrid.ruge_stuben(A)
p = AlgebraicMultigrid.aspreconditioner(m)
sol = @btime LinearSolve.solve(prob, LinearSolve.KrylovKitJL_GMRES(), Pl=p)
@show norm(r-A*sol.u);
51.581 s (603 allocations: 471.36 MiB)
norm(r - A * sol.u) = 7.902120848944424e-5
and
m = AlgebraicMultigrid.smoothed_aggregation(A)
p = AlgebraicMultigrid.aspreconditioner(m)
sol = @btime LinearSolve.solve(prob, LinearSolve.KrylovKitJL_GMRES(), Pl=p)
@show norm(r-A*sol.u);
46.302 s (613 allocations: 477.56 MiB)
norm(r - A * sol.u) = 0.0001005812834686952
all of these have run on Jupyter-lab
notebook with a single thread. Different runs of @btime
differ of up to 1 sec.
Please do let me know if you have any question or need more information.
side note: lu
and UMFPACKFactorization()
terminate on the same solution:
sol = LinearSolve.solve(prob, UMFPACKFactorization())
u = lu(A)\r
@show norm(u-sol.u);
norm(u - sol.u) = 0.0