JuAFEM + ForwardDiff + NLsolve = Finite elements for nonlinear problems?

It’s a stupid example, but it should show how it works:


using Optim

f(x) = sum(x -> x^2, x)                  # objective: sum of squares

g(G, x) = copy!(G, 2 .* x)               # in-place gradient

h(H, x) = H .= speye(size(H)...) .* 2    # in-place Hessian

# dense Hessian storage (the default)
obj_dense = TwiceDifferentiable(f, g, h, rand(40))

# sparse Hessian storage via the full constructor (preallocated F, G, H)
obj_sparse = TwiceDifferentiable(f, g, h, rand(40), 0.0, rand(40), speye(40))

res_dense = optimize(obj_dense, rand(40), Newton())

res_sparse = optimize(obj_sparse, rand(40), Newton())

which gives

julia> @time res_dense = optimize(obj_dense, rand(40), Newton())
  0.000454 seconds (319 allocations: 22.359 KiB)
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [0.5226916363931,0.43328193595961206, ...]
 * Minimizer: [0.0,0.0, ...]
 * Minimum: 9.398538e-32
 * Iterations: 1
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 8.65e-01 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 1.14e+32 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.22e-16 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 20
 * Gradient Calls: 20
 * Hessian Calls: 5

julia> @time res_sparse = optimize(obj_sparse, rand(40), Newton())
  0.000163 seconds (125 allocations: 11.563 KiB)
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [0.25005673247377125,0.2988072844582632, ...]
 * Minimizer: [0.0,0.0, ...]
 * Minimum: 0.000000e+00
 * Iterations: 1
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 9.99e-01 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = Inf |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 0.00e+00 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 8
 * Gradient Calls: 8
 * Hessian Calls: 4

If you have any good examples, I’m happy to make a notebook out of it.

The tricky part is that to use custom Hessian storage you have to use the full constructor

TwiceDifferentiable(f, g, h, initial_x, F, G, H)

where F, G, and H have the type and shape that f returns and that g and h expect as their first input. We could add keyword constructors so that only H has to be supplied, if people want that.
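For example, restating the sparse case from above, the preallocated outputs just have to match what f returns and what g and h fill in place:

F = 0.0          # f returns a scalar
G = zeros(40)    # g fills a Vector{Float64}
H = speye(40)    # h fills a SparseMatrixCSC
obj_sparse = TwiceDifferentiable(f, g, h, rand(40), F, G, H)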

Oh, this largely solves my question from above :smile:

So basically the g and h functions are dummies, right?

I don’t know what dummies means in this context.

I mean that since I’m constructing the Hessian matrix and gradient vector in-place myself during the loop, I would want to just pass them to the algorithm.

So you have only one function that does all the calculations?

Yes, it goes through a JuAFEM-discretized grid, calculating the Hessian and gradient of the objective function for each element specifically, and then assembles them into a global sparse Hessian matrix and gradient vector.
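Schematically, the loop looks something like this sketch (element_fgh!, celldofs, and n_elements are hypothetical stand-ins for the element routine, JuAFEM’s dof mapping, and the element count):

function assemble!(G, H, x)
    fx = 0.0
    fill!(G, 0)
    fill!(nonzeros(H), 0)    # zero the stored entries, keep the sparsity pattern
    for cell in 1:n_elements
        dofs = celldofs(cell)                 # local -> global dof map
        fe, ge, he = element_fgh!(x[dofs])    # element energy, gradient, hessian
        fx += fe
        G[dofs] += ge
        H[dofs, dofs] += he   # assumes the pattern already holds these entries
    end
    return fx
end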

Okay, this is not possible in the tagged version right now, but just the other day I added this to NLSolversBase: https://github.com/JuliaNLSolvers/NLSolversBase.jl/pull/50. I see that I didn’t think about specifying the Hessian cache there, though, so I’ll add it. The API would then be

TwiceDifferentiable(only_fgh!(fgh!), x0, F, G, H)

Sounds great! I’ll be waiting for it :smiley:

Working on it, just have to sort out some dispatch-related issues :slight_smile:

Alright, this works now (locally). I’ll get it merged soon and tag a release as soon as possible after that. Keep an eye on NLSolversBase.jl to see when the PR is merged; then you can use master until it gets tagged.

julia> using Optim

julia> f(x) = sum(x->x.^2, x)
f (generic function with 1 method)

julia> g(G, x) = copy!(G, 2.*x)
g (generic function with 1 method)

julia> h(H, x) = H .= speye(size(H)...).*2
h (generic function with 1 method)

julia> obj_dense = TwiceDifferentiable(f, g, h, rand(40))
NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}(f, g, NLSolversBase.fg!, h, 0.0, [4.94066e-324, 9.88131e-324, 1.4822e-323, 1.97626e-323, 2.47033e-323, 2.96439e-323, 3.45846e-323, 3.95253e-323, 4.44659e-323, 4.94066e-323  …  1.5316e-322, 1.58101e-322, 1.63042e-322, 1.67982e-322, 1.72923e-322, 1.77864e-322, 1.82804e-322, 1.87745e-322, 1.92686e-322, 1.97626e-322], [NaN NaN … NaN NaN; NaN NaN … NaN NaN; … ; NaN NaN … NaN NaN; NaN NaN … NaN NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [0], [0], [0])

julia> obj_sparse = TwiceDifferentiable(f, g, h, rand(40), 0.0, rand(40), speye(40))
NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},SparseMatrixCSC{Float64,Int64},Array{Float64,1}}(f, g, NLSolversBase.fg!, h, 0.0, [-2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0  …  -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0, -2.0], 
  [1 ,  1]  =  1.0
  [2 ,  2]  =  1.0
  ⋮
  [39, 39]  =  1.0
  [40, 40]  =  1.0, [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [0], [0], [0])

julia> res_dense = optimize(obj_dense, rand(40), Newton())
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [0.7352524500755169,0.32788238152886784, ...]
 * Minimizer: [1.1102230246251565e-16,0.0, ...]
 * Minimum: 5.315868e-32
 * Iterations: 1
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 1.00e+00 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = 2.51e+32 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 2.22e-16 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 4
 * Gradient Calls: 4
 * Hessian Calls: 1

julia> res_sparse = optimize(obj_sparse, rand(40), Newton())
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [0.11131455534747459,0.08998281236808703, ...]
 * Minimizer: [0.0,0.0, ...]
 * Minimum: 0.000000e+00
 * Iterations: 1
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 9.98e-01 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = Inf |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 0.00e+00 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 2
 * Gradient Calls: 2
 * Hessian Calls: 1

julia> function fgh!(F, G, H, x)
           if G !== nothing
               copy!(G, 2.*x)
           end
           if H !== nothing
               H .= speye(size(H)...).*2
           end
           if F !== nothing
               # compute and return the objective only when it is requested;
               # returning it unconditionally would error when F === nothing
               return sum(x->x.^2, x)
           end
           nothing
       end
fgh! (generic function with 1 method)

julia> obj_fgh = TwiceDifferentiable(NLSolversBase.only_fgh!(fgh!), rand(40), 0.0, rand(40), speye(40))
NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},SparseMatrixCSC{Float64,Int64},Array{Float64,1}}(NLSolversBase.#33, NLSolversBase.#34, NLSolversBase.#35, NLSolversBase.#36, 0.0, [6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310  …  6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, 6.90463e-310, -1.11022e-16], 
  [1 ,  1]  =  1.0
  [2 ,  2]  =  1.0
  ⋮
  [39, 39]  =  1.0
  [40, 40]  =  1.0, [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN  …  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN], [0], [0], [0])

julia> res_fgh = optimize(obj_fgh, rand(40), Newton())
Results of Optimization Algorithm
 * Algorithm: Newton's Method
 * Starting Point: [0.3736553645388738,0.40889116577184415, ...]
 * Minimizer: [0.0,0.0, ...]
 * Minimum: 0.000000e+00
 * Iterations: 1
 * Convergence: true
   * |x - x'| ≤ 1.0e-32: false 
     |x - x'| = 1.00e+00 
   * |f(x) - f(x')| ≤ 1.0e-32 |f(x)|: false
     |f(x) - f(x')| = Inf |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 0.00e+00 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 2
 * Gradient Calls: 2
 * Hessian Calls: 1

julia> obj_fgh.H
40×40 SparseMatrixCSC{Float64,Int64} with 40 stored entries:
  [1 ,  1]  =  2.0
  [2 ,  2]  =  2.0
  ⋮
  [39, 39]  =  2.0
  [40, 40]  =  2.0

Very good, thanks a ton! And to everyone else involved too! This should put me on the path to a solution, aside from some possible optimizations :slight_smile:

I added an example at https://github.com/KristofferC/JuAFEM.jl/blob/kc/ad_nl/examples/hyperelasticity.ipynb where the gradient and Hessian are computed using ForwardDiff.

It doesn’t involve solving with NLsolve or Optim yet, because we sort of have to be able to hook into the linear solver step (and the convergence calculation) to do that properly.

Cool, thanks! I need to have a look at what this HessianConfig with chunk size etc. does, because that is something that takes a very long time in my particular case (evaluating it the first time inside the loop). I was also wondering whether it is possible to make the cell loop threaded? I tried it, but at some point I got an error about “some indices not found”. I haven’t investigated why that happened yet, but I assume it has something to do with the assembly stage; see the per-thread sketch below. I was successful in threading the loop over the nquadpoints, which gave quite a reasonable speedup.
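A minimal sketch of one way to avoid races in the assembly stage (same hypothetical celldofs/n_elements helpers as above, plus a hypothetical element_gradient; each thread assembles into its own buffer, reduced serially at the end):

function assemble_threaded!(G, x)
    Gs = [zeros(G) for _ in 1:Threads.nthreads()]
    Threads.@threads for cell in 1:n_elements
        dofs = celldofs(cell)
        Gs[Threads.threadid()][dofs] += element_gradient(x[dofs])
    end
    copy!(G, sum(Gs))    # serial reduction into the global gradient
end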

About the solving: I’m waiting for the Optim.jl release where I can give it the caches and the function; then it should Just Work™?

There is apparently a significant compilation cost associated with using AD to compute the Hessians. Perhaps using a smaller chunk size can help a bit with that.
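For instance, a minimal sketch of forcing a smaller chunk via ForwardDiff’s HessianConfig (standalone toy objective; a smaller chunk means more runtime passes but less generated code to compile):

using ForwardDiff

f(x) = sum(abs2, x)
x = rand(40)

# chunk size 4 instead of the automatically chosen default
cfg = ForwardDiff.HessianConfig(f, x, ForwardDiff.Chunk{4}())
H = ForwardDiff.hessian(f, x, cfg)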

I’m using your method of solving, just like in the example. The cg solver complains a lot about the matrix not being positive definite; however, issymmetric(K) returns true and all the eigenvalues are positive. Is this due to too-large values inside the matrix K or inside f? Sorry for the bombardment of random questions, I’m very new to all this :stuck_out_tongue:

You can try using a direct solver instead, just K \ f.

I think it must be specific to my problem or something: when I do that, I get that the matrix has one or more zero pivots. I tried lufact(K) \ f and that gives a singular-matrix error.

Did you make sure to handle the boundary conditions? E.g. for mechanics you need to remove rigid body motion (with Dirichlet boundary conditions) or the matrix will have zero-energy modes.
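For reference, a generic sketch of imposing homogeneous Dirichlet conditions on an already assembled system (this is the plain zero-rows-and-columns approach, not JuAFEM’s API; fixed_dofs is whatever set of constrained dofs the problem has):

function apply_dirichlet!(K, f, fixed_dofs)
    for d in fixed_dofs
        K[d, :] = 0      # zero the row...
        K[:, d] = 0      # ...and the column, to keep symmetry
        K[d, d] = 1      # unit diagonal so K stays nonsingular
        f[d] = 0         # homogeneous condition u[d] = 0
    end
    K, f
end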

Ah I see, so it basically means that there is no well-defined solution? Fixing (Dirichlet) boundary conditions worked when I include 3 of the 4 terms above; the term Fq connecting P and e is still causing some trouble, but that can probably also be fixed by including the correct boundary conditions. Thanks so much for your help, I learned a lot!

Can you explain this briefly in an issue so we can track it? That would be very helpful (especially what sort of custom convergence check you would need).