 # Defining custom matrix-free operators for solving a system of equations in iterativesolvers.jl

I am solving elastodynamic+elastostatic equations for waves and fracture using a spectral element method in 2D.

One of the steps involves solving a large system of equations. I am currently using a preconditioned conjugate gradient with jacobi preconditioner. This step is very slow and does not converge for higher mesh resolution. I tried looking into Petsc but I am not sure if Petsc.jl works with julia 1.0.

I am trying to use iterativesolvers.jl to just play around with simple conjugate gradient method for a smaller problem. The issue is that my code has operators defined to construct the vectors from element level matrices and therefore I do not have a ‘matrix-form’ for `A` in the equation `Ax = b`.

The problem is defined as:
`K1u1 = K2u2`

I need to solve for `u1`, given `K1, K2, u2`. Right now, I use an initial guess for `u1` and assemble `K1u1` as well as `K2u2` from their local parts defined for each element in the finite element mesh. Given the function to assemble these vectors, how would I solve this using iterativesolvers? I have read that iterativesolvers gives the option to define custom operators, but I couldn’t find an example anywhere.

A second question is whether I can use the standard Petsc library written in c and call that from julia?

Any other input about solving this is also appreciated. @mohamed82008

1 Like

There’s an example in here:

which is described in this video at around 59 minutes in

Just update it to v1.0. That’s likely just `using LinearAlgebra` and using `mul!`

2 Likes

There is plenty of good stuff in Chris’s tutorial, so reading through it may give you ideas.

Regarding defining a linear operator, here is an example:

``````struct MatrixOperator{TM}
K::TM
end

LinearAlgebra.mul!(c, op::MatrixOperator, b) = mul!(c, op.K, b)
Base.size(op::MatrixOperator, i...) = size(op.K, i...)
Base.eltype(op::MatrixOperator) = eltype(op.K)
LinearAlgebra.:*(op::MatrixOperator, b) = mul!(similar(b), op.K, b)
``````

Since you are going matrix-free, you may consider using StaticArrays.jl for the element matrices. I am assuming the sizes of your matrices are known and fixed beyond a certain point in your code, so you can use a function barrier approach. Once you have a `Vector{<:SMatrix}` for your element matrices, you can:

1. Define your `mul!` in terms of this vector on the CPU, possibly making use of `Base.Threads` or `KissThreading.jl` for shared memory parallelism,
2. Move it to the GPU, e.g. `CuArrays.CuVector{<:SMatrix}` defining `mul!` with the help of CUDAnative.jl (example), or
3. Use a `DistributedArrays.jl` based approach which I haven’t played with yet.

Finally, if you can assemble the matrix without running out of memory, try the `AMGPreconditioner` from `Preconditioners.jl`. I noticed it works well when your matrix is not well-conditioned, giving some clock-time savings.

I don’t have experience with PETSc.jl but it does seem a bit rusty. Note however that DistributedArrays.jl defines `mul!` for a `DMatrix` so you only need to worry about the assembly of the matrix.

Extra nuggets: JuAFEM has an implementation of the coloring algorithm, and an example on shared-memory parallel assembly.

Good luck!

1 Like

Thanks for the suggestion and sorry for extremely late reply (I had my PhD quals). Both yours and Chris’s example helped me understand the basics of custom operators, but I am unable to implement it on my code. Here is what I have so far:

``````struct MatrixOperator
P::parameters
S::input_variables
size::Int64
end

Base.size(op::MatrixOperator, i...) = op.size
Base.eltype(op::MatrixOperator) = Float64

# Matrix-free multiplication operator
function LinearAlgebra.mul!(C,A::MatrixOperator,b)
for eo = 1:A.P.Nel    # loop over elements
ig = A.S.iglob[:,:,eo]    # index to convert from local to global
locall = local_calc(A.P, b[ig], A.S.H, A.S.Ht, S.W[:,:,eo])

# Assemble into global vector
C[ig] .+= locall
end
return C
end

function local_calc(P, locall, H, Ht, Wlocal)
# This function is the quadrature integration over each element given the weights
d_xi = Ht*locall
d_eta = locall*H

# Element contribution to the internal forces
locall = P.coefint1*H*(Wlocal.*d_xi) +
P.coefint2*(Wlocal.*d_eta)*Ht
return locall
end

LinearAlgebra.:*(A::MatrixOperator,B::AbstractVector) = (C = similar(B); LinearAlgebra.mul!(C,A,B))

a_local::Array{Float64} = zeros(S.nglob)

A = MatrixOperator(P,S,length(dnew))
a_local = LinearAlgebra.mul!(a_local,A,similar(a_local))
Fnew = -a_local[S.FltNI]  # this is the RHS of AX = B

U = cg(A,Fnew)
return  norm(A*U - Fnew)
end
``````

I feel like I have still not got the syntax completely. I need to pass in the parameters § and input variables (S) into my operator. Do you have any ideas on this?

My final problem at a good resolution has > 3 million degrees of freedom, so it is not possible to assemble the matrix. Also, in the case of shared memory parallelism using Base.threads, how do I achieve reduction of matrices (as shown in the function mul!)? atomic_add works only with integers so I don’t know how to use shared parallelism in this case.

Thanks a lot for your help.

To get the syntax right, the documentation will be most useful https://docs.julialang.org/en/v1/manual/types/index.html#Parametric-Types-1. To do parallel assembly of the result vectors, you can need to make sure to partition your mesh using the coloring algorithm to eliminate the need for locks without creating a race condition.

1 Like

Thanks a lot. I will try the coloring algorithm.