I have a feeling this may have been asked before but I can’t seem to find an answer online. The general question is: how can I compute the gradient of a function where the function depends on the solution of a PDE and the derivatives are taken wrt to the shape of the boundaries of the PDE? The problem, as far as I can tell, is that mesh objects are not differentiable, so I can’t use automatic differentiation to just magic my way through this. And the naive solution I came up with seems like overkill. So I was hoping that maybe someone here could educate me
For concreteness, imagine I wanted to optimize the shape of an object that was falling through a tube of water. For any particular object boundary \vec{\partial\Omega}(\{a_\alpha\}), where \{a_\alpha\} is a set of parameters that define the shape, I should be able to: (1) generate a suitable mesh of the 3D solution volume (still not totally sure how one does this, but let’s avoid the tangent); (2) specify that the velocity evaluated on the surface of the object is a constant vector \vec{v}(\vec{x_i})=\vec{v_o} where \vec{x_i}\, \epsilon\,\vec{\partial\Omega} are the vertices of the surface mesh or something similar; (3) similarly specify that the velocity at the tube walls is exactly zero; (4) solve the hydrodynamics problem using your favorite solver; and (5) finally compute the drag force on the object (the square of which, say, is our objective function).
Naively, I should be able to leave the mesh as is and directly differentiate my PDE wrt to my shape parameters and solve for \partial \vec{v}(\vec{x}, a_\alpha)/\partial a_\alpha by running my PDE solver over and over again for every shape variable. Thus computing the derivatives of the shear stresses and so on. Again, for concreteness, this would mean solving \partial \vec{v}(\vec{x}, a)/\partial a\cdot\nabla\vec{v}(\vec{x})+\vec{v}(\vec{x})\cdot\nabla\partial\vec{v}(\vec{x}, a)/\partial a - \nu\nabla^2\partial \vec{v}(\vec{x}, a)/\partial a=-\frac{1}{\rho}\nabla\partial p(\vec{x}, a)/\partial a
with boundary condition \partial \vec{x_i}/\partial a\cdot \nabla \vec{v}(\vec{x_i}) + \partial \vec{v}(\vec{x_i}, a)/\partial a=0
over and over again for every shape parameter using the \nabla\vec{v}(\vec{x}) and \vec{v}(\vec{x}) obtained from the initial solution.
Mercifully at least these PDEs are all linear. But there could be dozens or hundreds of them (squared and no longer linear if you wanted Hessians). And if I wanted to use this procedure to optimize the shape parameters I would then also have to automate and wrap the meshing operation inside of a function that computed the final objective function and gradients (which also assumes that discrete changes in the mesh from one step to the next don’t introduce non-differentiable noise into the gradient). So, first, is this correct/feasible and, second, is there no better way to do this?
Hey @tmigot, thanks for the reply! I suppose I can use this framework to deal with the outer optimization loop that executes the (possibly constrained) search for the shape parameters, but I don’t immediately see how this helps me compute gradients wrt to changes in the domain shape. I thought the GMSH mesher Gridap uses was an external library that was not susceptible to automatic differentiation magic. Is GridapEmbedded maybe 100% Julia and amenable to AD?
Hi, I am not an expert in shape optimisation methods but my understanding is that you typically don’t need to differentiate the meshing procedure. Locally if you change the outer coordinates of your mesh by an infinitesimal amount, the remaining nodes and connectivity of the mesh can be assumed to be unchanged. This is because a mesh is just a discretisation scheme that helps you approximate your integrals. If you assume that the mesh is good enough even if the outer nodes move a little, then you don’t really need to consider the gradient of the mesh wrt the outer coordinate changes. However, when the mesh stops being good enough due to large deformations in the outer elements then at least partial re-meshing would be necessary. Given this particular definition of the mesh(x) function where x is the outer nodes’ coordinates, the mesh function would be a semi-continuous, “step” function of x with derivatives 0 almost everywhere except at the points of dis-continuity where the derivative is undefined. So you can basically ignore its derivative and hope that the dis-continuity here won’t hurt your convergence metrics. Intuitively, your optimal shape should be largely independent of the specific inner node coordinates of the mesh and the node connectivity for any mesh that gives you a good enough approximation of the objective function. So we are ignoring the derivative of the discretisation error wrt the outer coordinates or shape parameters.
There is lots of literature on shape optimisation that you might want to read before implementing these methods since I am sure there are nuances and other variants which the “naive way” above doesn’t cover. For instance, if your shape is parameterised using only a few geometric shape parameters, then it is possible to efficiently get the gradient of your objective function with finite difference. An autodiff-friendly meshing package would be cool to have and I always wanted to implement one but people have been doing shape optimisation without one for a few decades now. So it is not really necessary. You might also want to look at the field of “topology optimisation” which takes a more general and arguably easier approach than more classical “shape optimisation” methods. PDENLPModels.jl and TopOpt.jl are 2 packages that implement topology optimisation methods. Disclaimer: I am the creator and maintainer of the latter.
Hi @mohamed82008, thanks for the reply. I agree that I should not need to alter the mesh to compute derivatives analytically (though I probably will have to remesh or at least update the mesh every time I pick new shape parameters). But I’m pretty sure that I need to know the direction in which the surface nodes would move in response to an infinitesimal shape perturbation.
But if I have the shape function(s) I should be able to compute its derivatives (automatically or by hand) and convert those into infinitesimal vertex displacement vectors. In my naive implementation those would then be passed (along with the unperturbed velocity and strain rate fields) to the N downstream linear PDE solvers that would compute the gradients wrt the N shape parameters. So I would need a function that takes in the surface vertex coordinates and spits out the shape derivatives at every point. It just seems like a lot of work though.
Your comment about piecewise discontinuous mesh functions did remind me to think about what would happen along piecewise continuous, but not differentiable, shape edges (for example if the object falling down the water pipe was not a ball but a cube). Presumably those would involve line integrals of some kind.
@instant0n I had similar problems that I wanted to model. Usually, I am extending the “discrete” function over the entire domain. Otherwise, it is hard to do in general because it highly depends on the discretization chosen.
Speed with a good question, I don’t know other packages that handle finite-element in optimization problems, so there is no existing benchmark as far as I am aware.
Hey @instant0n ! Sorry for the very late reply, the quick answer after some very long thoughts is I don’t think so… or at least not directly this would require new features, but that’s definitely a desirable one.
Hi @tmigot - Thanks for getting back to me. I realized a while ago that, because my problem was linear, it was straightforward to solve the optimization problem using fundamental solutions. Still, would be a great feature for nonlinear problems!