# JuAFEM + ForwardDiff + NLsolve = Finite elements for nonlinear problems?

Hi,

I’m in the process of solving a Landau theory model concerning a crystal where strain and electric polarization are coupled. I’m trying to minimize the total free energy given by:

``````F = Fl(P, α) + Fc(ε, C) + Fq(P, ε, q) + Fg(∇P, G)
function Fl(P::JuAFEM.Vec{3}, α)
P² = P.^2
P⁴ = P.^4
return α[1, 1, 1] * sum(P²)
+ α[1, 1, 1] * sum(P⁴)
+ α[1, 1, 2] * sum(P.^6)
+ α[1, 2, 1] * (P²[1]*P²[2] + P²[2]*P²[3] + P²[1]*P²[3])
+ α[1, 1, 2] * (P⁴[1]*(P²[2] + P²[3]) + P⁴[2]*(P²[1] + P²[3]) + P⁴[3]*(P²[1] + P²[2]))
+ α[1, 2, 3] * prod(P²)
end

Fc(ε::Tensor{2, 3}, C::Tensor{4, 3}) = 0.5ε ⊡ C ⊡ ɛ

Fq(P::Vec{3}, ε::Tensor{2, 3}, q::Tensor{4, 3}) = -P ⋅ (ε ⊡ q) ⋅ P

Fg(∇P::Tensor{2, 3}, G::Tensor{4, 3}) = 0.5∇P ⊡ G ⊡ ∇P
``````

`ε` is the gradient of the displacement field in all directions, `P` is the polarization, `C` gives the penalty for creating strain, `q` the coupling between strain and polarization `P`, `G` the Ginzburg coefficients for spatial variation of `P` and `α` the different Landau components for free energy from `P`.

Since we are dealing with 6 variables `u_1,2,3, P_1,2,3` in 2D, on a mesh of size `> 2000x600`, throwing it into `Optim.jl` and brute force optimize it is pretty much impossible. Next option is to analytically minimize the free energy with regards to the two fields (`u`, `P`) and solve the equations on a FEM model. This results in a system of nonlinear equations.

Now we come to the questions. Looking at the examples of `JuAFEM` I can almost understand how to solve a nonlinear model, but since I’m not really an expert on FEM I don’t fully understand how to connect `JuAFEM` to a nonlinear solver like `NLsolve`, is this possible? And if so, how does one go about it? The second question is more about how to use `ForwardDiff`, can I exploit this package to not have to analytically compute the derivatives of `F`, which result in the nonlinear equations? This would be a great aid, because the derivation is pretty cumbersome and prone to errors. The main thing I don’t fully understand is if it’s possible to derive the `Fq` equation, which couples the two fields, through the gradient `ε`. From what I read, `ForwardDiff` only works when a function can be called with one `Vector` input or basically 1 parameter, can I use a tuple to derive this? Will that impact performance?

Is there another suite of packages I should have a look at, or that encompass everything I want to do?

Cheers

2 Likes

Nope, this is the combination I would recommend.

Take your PDE, move all of the terms to one side, and then if it’s time-independent you get an `F(x)=0` function, and the discretized form via finite elements is then the one that you want to find the zero of. You just need to put all of the variables you want to solve for into one array.

NLsolve.jl will do this for you. I think you may have to set `autodiff=true` or something like that, but that’s it.

2 Likes

I don’t really understand your problem: why is it intractable to put into Optim, but tractable to solve it as a nonlinear system of equations? It’s usually faster for large-scale systems to minimize E(x) by first-order methods than to solve grad E = 0. Am I missing something?

1 Like

Well, I tried this. Using `JuAFEM` to provide the gradients of `u` and `P` on a discretized grid of (100, 50, 2), and then using the above equations to calculate the total free energy, I arrive at a calculation that takes on the order of 0.1s (if I inject a test configuration of `u` and `P`). However when I then let `Optim.jl` try to optimize it, I arrive at something that takes on the order of days, mainly because it creates `94.4GB` worth of data for some reason. Ultimately the grid should be around (2000,600,2) preferably even bigger.

Is everything type-stable and non-allocating? Sounds like this is a performance and scaling issue, not so much a methodology issue (since BFGS is probably what you want).

Is it that Optim.jl tries to build a dense N x N matrix where N=2000*600*2 \approx 10^6? For problems like this matrix-free (Krylov) methods are probably essential.

1 Like

You’re right about it not being type stable, I expected it to not matter too much since one evaluation only took 0.2s. However, there is 500mb being allocated. Does `Optim` do a lot of iterations that it stores behind the scenes?

So I used `JuAFEM` to do discretization because I was a bit lazy. This resulted in the following code:

``````function f(dh, up::Vector{T}, cvu, cvP, α, C, q, G) where T
out = 0.0
up_ = reinterpret(JuAFEM.Vec{3, T}, up)
uel = fill(zero(Vec{3,Float64}), 4)
Pel = fill(zero(Vec{3,Float64}), 4)
for cell in CellIterator(dh)
reinit!(cvu, cell)
reinit!(cvP, cell)
globaldofs = celldofs(cell)
uel = [up_[div(globaldofs[i],3)+1] for i=1:3:12]
Pel = [up_[div(globaldofs[i],3)+1] for i=13:3:24]
δP = function_value(cvP, qp, Pel)

out += Fl(δP, α) + Fc(δε, C) + Fq(δP, δε, q) + Fg(∇P, G)
end
end
return out
end
``````

What is the worst point is for sure where I try to get the input `up` vector containing all the degrees of freedom that Optim is optimizing, into a format that can be used by `JuAFEM` for the gradients. Is it possible to store this directly into the fields of the `DofHandler` class of `JuAFEM` at each iteration? I should probably give up on this and write it myself, I’m trying to understand just what is possible using these packages.

Note that you cannot apply AD directly to `f` because `up` has such a high dimension.
The trick is to factor out the contribution from each element and apply AD to that one.
Each element contributions is a R^n → R where n is quite small.
Applying AD to that will give you a gradient (dim R^n) and a Hessian (dim R^(n x n)) that you assemble in standard fashion to the global “residual vector” and “stiffness matrix”. The result of the function is then the potential, gradient and hessian which is what Optim needs to use e.g. Newton’s method.

So your function would look something like (pseudo codeish):

``````"""
"""
function fH!(dh, K::SparseMatrixCSC, f::Vector{T}, up::Vector{T}, cv) where T
Ψ = zero(T)
assembler = start_assemble(K, f)
for cell in CellIterator(dh)
globaldofs = celldofs(cell)
u  = up[globaldofs]
element_closure(u) = element_potential(dh, u, cv)
Ψe, fe, Ke = ForwardDiff.Hessian(element_closures, u)
Ψ += Ψ_e
assemble!(assembler, global_dofs, fe, Ke)
end
return Ψ
end
``````
1 Like

I’m trying this out, however I get an error while evaluating the hessian, stating that `one` is only defined for 2nd or 4th order tensors.

``````function fH!(dh, K::SparseMatrixCSC, f::Vector{T}, up::Vector{T}, cvu, cvP, α, C, q, G) where T
out = zero(T)
assembler = start_assemble(K, f)
up_ = reinterpret(JuAFEM.Vec{3, T}, up)
uel = fill(zero(Vec{3,T}), 8)
# Pel = fill(zero(Vec{3,T}), 4)
globaldofs = Vector{Int}(24)
for cell in CellIterator(dh)
reinit!(cvu, cell)
reinit!(cvP, cell)
globaldofs[:] = celldofs(cell)
uel[:] = [up_[div(globaldofs[i],3)+1] for i=1:3:24]
element_closure(u) = element_potential(globaldofs, cvP, cvU, qp, u, α, C, q, G)
oute, fe, Ke = ForwardDiff.hessian(element_closure, uel)
out += oute
assemble!(assembler, globaldofs, fe, Ke)
end
return out
end
function element_potential(globaldofs, cvP, cvU, qp, u::Vector{Vec{3, T}}, α, C, q, G) where T
out = zero(T)
uel = u[1:4]
Pel = u[5:8]
δP = function_value(cvP, qp, Pel)

out += Fl(δP, α) + Fc(δε, C) + Fq(δP, δε, q) + Fg(∇P, G)
end
out
end
``````

Btw I’m sorry if I’m polluting this thread with so much garbage code. If you want me to slim it down to mwe, I can do it in an edit.

This is pretty spot on. It’s trying to solve a dense linear system of this size every step of the optimizer/rootfinder. You probably want to make sure it’s using a preconditioner iterative method from IteativeSolvers.jl or a sparse factorization instead.

Tweaking the solver routine in Optim is attacking the wrong problem. What needs to be changed is the input to Optim, it needs to be a function that computes the sparse Hessian (optionally using AD for the element contributions).

`element_closure` needs to be R^n → R so you need to convert to `Vec` inside the element routine.

Also I don’t think the `oute, fe, Ke = ForwardDiff.hessian(element_closure, uel)` syntax works, it was just psuedocode. I think you need to use the “lower order result” API in ForwardDiff: Advanced Usage Guide · ForwardDiff.

Can Optim even use a sparse Hessian though? I don’t think it can.

Not sure. I always solve things like this in NLSolve by just giving it the residual and tangent. The Newton’s method and trust region method in NLSolve work well with sparse matrices.

Yeah, NLSolve.jl has this, but I don’t think Optim ever copied that over. @pkofod could clear that up.

1 Like

If that is the case, then abandon all hope in using Optim for this :D. Sparse matrix support is a must have for this use case.

I’ll try to make a tutorial/example for this use case (potential solved with AD + NLSolve/Optim) which combines many nice things of Julia.

4 Likes

For Newton, it looks like you can directly give the Hessian type here:

Then this looks sparse safe:

I’ve never seen an example of it, but it looks like Optim’s Newton could probably do this. So…

Very nice help so far, thanks a lot! The memory issue is solved using this, I tested it using `ConjugateGradient()` from `Optim` using a `OnceDifferentiable` providing `fH!` and the `f`. However when I try to create a `TwiceDifferentiable` providing `fH!`, `f` and `K`, I run into an `OutOfMemory()` error. What should be `g!` and `gh!` in the constructor of `TwiceDifferentiable`?