Defining custom matrix-free operators for solving a system of equations in iterativesolvers.jl

I am solving elastodynamic+elastostatic equations for waves and fracture using a spectral element method in 2D.

One of the steps involves solving a large system of equations. I am currently using a preconditioned conjugate gradient with jacobi preconditioner. This step is very slow and does not converge for higher mesh resolution. I tried looking into Petsc but I am not sure if Petsc.jl works with julia 1.0.

I am trying to use iterativesolvers.jl to just play around with simple conjugate gradient method for a smaller problem. The issue is that my code has operators defined to construct the vectors from element level matrices and therefore I do not have a ‘matrix-form’ for A in the equation Ax = b.

The problem is defined as:
K1u1 = K2u2

I need to solve for u1, given K1, K2, u2. Right now, I use an initial guess for u1 and assemble K1u1 as well as K2u2 from their local parts defined for each element in the finite element mesh. Given the function to assemble these vectors, how would I solve this using iterativesolvers? I have read that iterativesolvers gives the option to define custom operators, but I couldn’t find an example anywhere.

A second question is whether I can use the standard Petsc library written in c and call that from julia?

Any other input about solving this is also appreciated. @mohamed82008


There’s an example in here:

which is described in this video at around 59 minutes in

Just update it to v1.0. That’s likely just using LinearAlgebra and using mul!


There is plenty of good stuff in Chris’s tutorial, so reading through it may give you ideas.

Regarding defining a linear operator, here is an example:

struct MatrixOperator{TM}

LinearAlgebra.mul!(c, op::MatrixOperator, b) = mul!(c, op.K, b)
Base.size(op::MatrixOperator, i...) = size(op.K, i...)
Base.eltype(op::MatrixOperator) = eltype(op.K)
LinearAlgebra.:*(op::MatrixOperator, b) = mul!(similar(b), op.K, b)

Since you are going matrix-free, you may consider using StaticArrays.jl for the element matrices. I am assuming the sizes of your matrices are known and fixed beyond a certain point in your code, so you can use a function barrier approach. Once you have a Vector{<:SMatrix} for your element matrices, you can:

  1. Define your mul! in terms of this vector on the CPU, possibly making use of Base.Threads or KissThreading.jl for shared memory parallelism,
  2. Move it to the GPU, e.g. CuArrays.CuVector{<:SMatrix} defining mul! with the help of CUDAnative.jl (example), or
  3. Use a DistributedArrays.jl based approach which I haven’t played with yet.

Finally, if you can assemble the matrix without running out of memory, try the AMGPreconditioner from Preconditioners.jl. I noticed it works well when your matrix is not well-conditioned, giving some clock-time savings.

I don’t have experience with PETSc.jl but it does seem a bit rusty. Note however that DistributedArrays.jl defines mul! for a DMatrix so you only need to worry about the assembly of the matrix.

Extra nuggets: JuAFEM has an implementation of the coloring algorithm, and an example on shared-memory parallel assembly.

Good luck!


Thanks for the suggestion and sorry for extremely late reply (I had my PhD quals). Both yours and Chris’s example helped me understand the basics of custom operators, but I am unable to implement it on my code. Here is what I have so far:

struct MatrixOperator

Base.size(op::MatrixOperator, i...) = op.size
Base.eltype(op::MatrixOperator) = Float64

# Matrix-free multiplication operator
function LinearAlgebra.mul!(C,A::MatrixOperator,b)
    for eo = 1:A.P.Nel    # loop over elements
        ig = A.S.iglob[:,:,eo]    # index to convert from local to global
        locall = local_calc(A.P, b[ig], A.S.H, A.S.Ht, S.W[:,:,eo])

        # Assemble into global vector
        C[ig] .+= locall
    return C

function local_calc(P, locall, H, Ht, Wlocal)
    # This function is the quadrature integration over each element given the weights
    d_xi = Ht*locall
    d_eta = locall*H

    # Element contribution to the internal forces
    locall = P.coefint1*H*(Wlocal.*d_xi) +
    return locall

LinearAlgebra.:*(A::MatrixOperator,B::AbstractVector) = (C = similar(B); LinearAlgebra.mul!(C,A,B))

function conj_grad(P::parameters, S::input_variables, initial_guess, F)

    a_local::Array{Float64} = zeros(S.nglob)

   A = MatrixOperator(P,S,length(dnew))
   a_local = LinearAlgebra.mul!(a_local,A,similar(a_local))
   Fnew = -a_local[S.FltNI]  # this is the RHS of AX = B

    U = cg(A,Fnew)
    return  norm(A*U - Fnew)

I feel like I have still not got the syntax completely. I need to pass in the parameters § and input variables (S) into my operator. Do you have any ideas on this?

My final problem at a good resolution has > 3 million degrees of freedom, so it is not possible to assemble the matrix. Also, in the case of shared memory parallelism using Base.threads, how do I achieve reduction of matrices (as shown in the function mul!)? atomic_add works only with integers so I don’t know how to use shared parallelism in this case.

Thanks a lot for your help.

To get the syntax right, the documentation will be most useful To do parallel assembly of the result vectors, you can need to make sure to partition your mesh using the coloring algorithm to eliminate the need for locks without creating a race condition.

1 Like

Thanks a lot. I will try the coloring algorithm.