# Sparse matrix coefficient computation with ForwardDiff and LoopVectorization

Hi everyone,
I am trying to write a code that computes both the residual and sparse matrix coefficients for multi-dimensional non-linear Poisson-type problems. The idea is that information about the stencil and boundary conditions are only present in the functions that evaluate the stencil (residual function + connectivity/sparsity function).
Sparse system coefficients can be computed manually/symbolically but this approach implies that even a tiny change the residual function requires a manual modification to the function that evaluates system coefficients. Using automatic differentiation, one could simply evaluate the system coefficients by computing the gradient of the residual function w.r.t stencil point values. As a result, modifications of the residual function do not necessarily require to modify the function that evaluates system coefficients - very nice. Also, since things can be made fast (using LoopVectorization), the time for the assembly of the system matrix can be kept much smaller than the time required for computing Cholesky factors.
Using the advices provided there I could come up with the MWE below for 1D linear Poisson.
While this script works fine, I would like to make it more general. In particular, I would like to be able to predefine the number of stencil points without having to change any other function than `Poisson1D` and `PoissonSparsity1D`.
For example, if I would want to use the current approach to make a 2D high order discretisation, I could end up with more that 10 stencil points and thus:

1. I need to find an appropriate a data structure to store neighbour values/coefficients (`TC,TW,TE`) and indices (`IC,IW,IE`). Potential options are Vectors of Arrays, multi-dimensional Arrays, Structs, tuples… If anyone has some hints about that, they would be welcome As far as I’ve experienced, the trickiest part is to find which data structure will allow the code to still work with both LoopVectorization and ForwardDiff.

2. The definition and calls to `Poisson_valgrad1D` would require an embarrassingly large number of arguments. What would be the best way to organise the code such that a function like `Poisson_valgrad1D` would not need one argument entry per stencil connection but rather one data structure? This is of course intrinsically linked to point #1.

3. Also one would need to change the way the triplets vectors (`I,J,V`) are assembled. But I guess this can be made nice by using a comprehension over the appropriate data structure (so also linked to point #1).

4. Finally, any other hints or comments are welcome

Cheers!

``````using SIMDDualNumbers, ForwardDiff, StaticArrays, LoopVectorization, SparseArrays, LinearAlgebra, Plots

function PoissonSparsity1D!(TC,TW,TE)
TW[2:end-0] .= TC[1:end-1];
TE[1:end-1] .= TC[2:end-0];
end

function Poisson1D(x::AbstractVector{<:Real}, data, i)
kx, dx, ncx, TW_BC, TE_BC = data
TC, TW, TE = x[1], x[2], x[3]
TW1     = SIMDDualNumbers.ifelse(i==1, 2TW_BC-TC, TW)
TE1     = SIMDDualNumbers.ifelse(i==ncx, 2TE_BC-TC, TE)
qxW     = -kx*(TC - TW1)/dx
qxE     = -kx*(TE1 - TC)/dx
F       = (qxE - qxW)/dx
return F
end

Poisson1D(TC::Real,TW::Real,TE::Real,data,i) = Poisson1D(@SVector([TC,TW,TE]),data,i)  # use multiple dispatch to make it work with the desired argument list
sv = @SVector([v,w,x])
end

function main()

# Initialise
ncx = 10
data = (kx=1.0, dx=0.1, ncx, TW_BC=1.0, TE_BC=2.0)
F, T = fill(0.0, ncx), fill(0.0, ncx)  # residual, solution fields
TC,TW,TE = fill(0.0,  ncx), fill(0.0,   ncx), fill(0.0,   ncx) # working arrays double
IC,IW,IE = fill(  1,  ncx), fill(  1,   ncx), fill(  1,   ncx) # working arrays double
I, J, V  = fill(  0,3*ncx), fill(  0, 3*ncx), fill(0.0, 3*ncx)

# Get neigbour values and generate connectivity
TC .= T
PoissonSparsity1D!(TC,TW,TE)

# Generate connectivity
IC .= collect(1:ncx)
PoissonSparsity1D!(IC,IW,IE)

# Compute Finite Difference coefficients
@turbo for i=1:ncx
F[i], (TC[i], TW[i], TE[i]) = Poisson_valgrad1D(Poisson1D, TC[i], TW[i], TE[i], data,i)
end

# Fill triplets and assemble
I .= [IC[:]; IC[:]; IC[:]]
J .= [IC[:]; IW[:]; IE[:]]
V .= [TC[:]; TW[:]; TE[:]]
@time K  = sparse(I,J,V)
droptol!(K,1e-9)

@show K

# Solve
@time T[:] .-= cholesky(K)\F[:]

# Set central node and get neigbour values
TC .= T
PoissonSparsity1D!(TC,TW,TE)

# Compute Finite Difference coefficients
@turbo for i=1:ncx
F[i] = Poisson1D(TC[i],TW[i],TE[i],data,i)
end
println("r = ", norm(F[:]))

# display(plot(T))
end

for it=1:2
@time main()
end
``````