JuAFEM + ForwardDiff + NLsolve = Finite elements for nonlinear problems?



I’m in the process of solving a Landau theory model concerning a crystal where strain and electric polarization are coupled. I’m trying to minimize the total free energy given by:

F = Fl(P, α) + Fc(ε, C) + Fq(P, ε, q) + Fg(∇P, G)
function Fl(P::JuAFEM.Vec{3}, α)
    P² = P.^2
    P⁴ = P.^4
    return α[1, 1, 1] * sum(P²)
         + α[1, 1, 1] * sum(P⁴)
         + α[1, 1, 2] * sum(P.^6)
         + α[1, 2, 1] * (P²[1]*P²[2] + P²[2]*P²[3] + P²[1]*P²[3])
         + α[1, 1, 2] * (P⁴[1]*(P²[2] + P²[3]) + P⁴[2]*(P²[1] + P²[3]) + P⁴[3]*(P²[1] + P²[2]))
         + α[1, 2, 3] * prod(P²)

Fc(ε::Tensor{2, 3}, C::Tensor{4, 3}) = 0.5ε ⊡ C ⊡ ɛ

Fq(P::Vec{3}, ε::Tensor{2, 3}, q::Tensor{4, 3}) = -P ⋅ (ε ⊡ q) ⋅ P

Fg(∇P::Tensor{2, 3}, G::Tensor{4, 3}) = 0.5∇P ⊡ G ⊡ ∇P

ε is the gradient of the displacement field in all directions, P is the polarization, C gives the penalty for creating strain, q the coupling between strain and polarization P, G the Ginzburg coefficients for spatial variation of P and α the different Landau components for free energy from P.

Since we are dealing with 6 variables u_1,2,3, P_1,2,3 in 2D, on a mesh of size > 2000x600, throwing it into Optim.jl and brute force optimize it is pretty much impossible. Next option is to analytically minimize the free energy with regards to the two fields (u, P) and solve the equations on a FEM model. This results in a system of nonlinear equations.

Now we come to the questions. Looking at the examples of JuAFEM I can almost understand how to solve a nonlinear model, but since I’m not really an expert on FEM I don’t fully understand how to connect JuAFEM to a nonlinear solver like NLsolve, is this possible? And if so, how does one go about it? The second question is more about how to use ForwardDiff, can I exploit this package to not have to analytically compute the derivatives of F, which result in the nonlinear equations? This would be a great aid, because the derivation is pretty cumbersome and prone to errors. The main thing I don’t fully understand is if it’s possible to derive the Fq equation, which couples the two fields, through the gradient ε. From what I read, ForwardDiff only works when a function can be called with one Vector input or basically 1 parameter, can I use a tuple to derive this? Will that impact performance?

Is there another suite of packages I should have a look at, or that encompass everything I want to do?

Thanks in advance for any comments/ help!


Nope, this is the combination I would recommend.

Take your PDE, move all of the terms to one side, and then if it’s time-independent you get an F(x)=0 function, and the discretized form via finite elements is then the one that you want to find the zero of. You just need to put all of the variables you want to solve for into one array.

NLsolve.jl will do this for you. I think you may have to set autodiff=true or something like that, but that’s it.


I don’t really understand your problem: why is it intractable to put into Optim, but tractable to solve it as a nonlinear system of equations? It’s usually faster for large-scale systems to minimize E(x) by first-order methods than to solve grad E = 0. Am I missing something?


Well, I tried this. Using JuAFEM to provide the gradients of u and P on a discretized grid of (100, 50, 2), and then using the above equations to calculate the total free energy, I arrive at a calculation that takes on the order of 0.1s (if I inject a test configuration of u and P). However when I then let Optim.jl try to optimize it, I arrive at something that takes on the order of days, mainly because it creates 94.4GB worth of data for some reason. Ultimately the grid should be around (2000,600,2) preferably even bigger.


Is everything type-stable and non-allocating? Sounds like this is a performance and scaling issue, not so much a methodology issue (since BFGS is probably what you want).


Is it that Optim.jl tries to build a dense N x N matrix where N=2000*600*2 \approx 10^6? For problems like this matrix-free (Krylov) methods are probably essential.


You’re right about it not being type stable, I expected it to not matter too much since one evaluation only took 0.2s. However, there is 500mb being allocated. Does Optim do a lot of iterations that it stores behind the scenes?

So I used JuAFEM to do discretization because I was a bit lazy. This resulted in the following code:

function f(dh, up::Vector{T}, cvu, cvP, α, C, q, G) where T
    out = 0.0
    up_ = reinterpret(JuAFEM.Vec{3, T}, up)
    uel = fill(zero(Vec{3,Float64}), 4)
    Pel = fill(zero(Vec{3,Float64}), 4)
    for cell in CellIterator(dh)
        reinit!(cvu, cell)
        reinit!(cvP, cell)
        globaldofs = celldofs(cell)
        uel = [up_[div(globaldofs[i],3)+1] for i=1:3:12]
        Pel = [up_[div(globaldofs[i],3)+1] for i=13:3:24]
        for qp in 1:getnquadpoints(cvu)
            ∇P = function_gradient(cvP, qp, Pel)
            δP = function_value(cvP, qp, Pel)
            δε = function_gradient(cvu, qp, uel)

            out += Fl(δP, α) + Fc(δε, C) + Fq(δP, δε, q) + Fg(∇P, G)
    return out

What is the worst point is for sure where I try to get the input up vector containing all the degrees of freedom that Optim is optimizing, into a format that can be used by JuAFEM for the gradients. Is it possible to store this directly into the fields of the DofHandler class of JuAFEM at each iteration? I should probably give up on this and write it myself, I’m trying to understand just what is possible using these packages.


We should write a tutorial about this, it is definitely possible to do and quite efficient.

Note that you cannot apply AD directly to f because up has such a high dimension.
The trick is to factor out the contribution from each element and apply AD to that one.
Each element contributions is a R^n -> R where n is quite small.
Applying AD to that will give you a gradient (dim R^n) and a Hessian (dim R^(n x n)) that you assemble in standard fashion to the global “residual vector” and “stiffness matrix”. The result of the function is then the potential, gradient and hessian which is what Optim needs to use e.g. Newton’s method.

So your function would look something like (pseudo codeish):

Computes the potential and updates the hessian K and gradient f
function fH!(dh, K::SparseMatrixCSC, f::Vector{T}, up::Vector{T}, cv) where T
    Ψ = zero(T)
    assembler = start_assemble(K, f)
    for cell in CellIterator(dh)
        globaldofs = celldofs(cell)
        u  = up[globaldofs]
        element_closure(u) = element_potential(dh, u, cv) 
        Ψe, fe, Ke = ForwardDiff.Hessian(element_closures, u)
        Ψ += Ψ_e
        assemble!(assembler, global_dofs, fe, Ke)
    return Ψ


I’m trying this out, however I get an error while evaluating the hessian, stating that one is only defined for 2nd or 4th order tensors.

function fH!(dh, K::SparseMatrixCSC, f::Vector{T}, up::Vector{T}, cvu, cvP, α, C, q, G) where T
    out = zero(T)
    assembler = start_assemble(K, f)
    up_ = reinterpret(JuAFEM.Vec{3, T}, up)
    uel = fill(zero(Vec{3,T}), 8)
    # Pel = fill(zero(Vec{3,T}), 4)
    globaldofs = Vector{Int}(24)
    for cell in CellIterator(dh)
        reinit!(cvu, cell)
        reinit!(cvP, cell)
        globaldofs[:] = celldofs(cell)
        uel[:] = [up_[div(globaldofs[i],3)+1] for i=1:3:24]
        element_closure(u) = element_potential(globaldofs, cvP, cvU, qp, u, α, C, q, G)
        oute, fe, Ke = ForwardDiff.hessian(element_closure, uel)
        out += oute
        assemble!(assembler, globaldofs, fe, Ke)
    return out
function element_potential(globaldofs, cvP, cvU, qp, u::Vector{Vec{3, T}}, α, C, q, G) where T
    out = zero(T)
    uel = u[1:4]
    Pel = u[5:8]
    for qp in 1:getnquadpoints(cvu)
        ∇P = function_gradient(cvP, qp, Pel)
        δP = function_value(cvP, qp, Pel)
        δε = function_gradient(cvu, qp, uel)

        out += Fl(δP, α) + Fc(δε, C) + Fq(δP, δε, q) + Fg(∇P, G)

Btw I’m sorry if I’m polluting this thread with so much garbage code. If you want me to slim it down to mwe, I can do it in an edit.


This is pretty spot on. It’s trying to solve a dense linear system of this size every step of the optimizer/rootfinder. You probably want to make sure it’s using a preconditioner iterative method from IteativeSolvers.jl or a sparse factorization instead.


Tweaking the solver routine in Optim is attacking the wrong problem. What needs to be changed is the input to Optim, it needs to be a function that computes the sparse Hessian (optionally using AD for the element contributions).


element_closure needs to be R^n -> R so you need to convert to Vec inside the element routine.

Also I don’t think the oute, fe, Ke = ForwardDiff.hessian(element_closure, uel) syntax works, it was just psuedocode. I think you need to use the “lower order result” API in ForwardDiff: http://www.juliadiff.org/ForwardDiff.jl/latest/user/advanced.html#Retrieving-Lower-Order-Results-1.


Can Optim even use a sparse Hessian though? I don’t think it can.


Not sure. I always solve things like this in NLSolve by just giving it the residual and tangent. The Newton’s method and trust region method in NLSolve work well with sparse matrices.


Yeah, NLSolve.jl has this, but I don’t think Optim ever copied that over. @pkofod could clear that up.


If that is the case, then abandon all hope in using Optim for this :D. Sparse matrix support is a must have for this use case.

I’ll try to make a tutorial/example for this use case (potential solved with AD + NLSolve/Optim) which combines many nice things of Julia.


For Newton, it looks like you can directly give the Hessian type here:

Then this looks sparse safe:

I’ve never seen an example of it, but it looks like Optim’s Newton could probably do this. So…

please do!


a bit late to the game, but yes it should be sparse safe! Next on the agenda is to be able to choose factorization yourself


Any examples?


Very nice help so far, thanks a lot! The memory issue is solved using this, I tested it using ConjugateGradient() from Optim using a OnceDifferentiable providing fH! and the f. However when I try to create a TwiceDifferentiable providing fH!, f and K, I run into an OutOfMemory() error. What should be g! and gh! in the constructor of TwiceDifferentiable?