A hacky guide to using automatic differentiation in nested optimization problems

First of all, if you are solving \min_x \left[ \min_z h(x,z) \right], then you should simply combine it into

\min_{x,z} h(x,z)

i.e. minimize over x and z simultaneously. Assuming h(x,z) is smooth, you can then apply any multivariate minimization approach that you want. Adding a constraint g(x,z) = 0, or any other differentiable constraint, is also easy.
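For example, here is a minimal sketch of the joint formulation with Optim.jl and forward-mode AD; the toy h(x,z) and the starting point are made-up placeholders, not anything from the original problem:

```julia
# Minimal sketch of the joint formulation, assuming Optim.jl;
# h and the starting point are hypothetical toy placeholders.
using Optim

h(x, z) = (x - 1)^2 + (x * z - 2)^2   # a made-up smooth h(x, z)

# Pack (x, z) into a single vector and minimize over both at once;
# autodiff = :forward supplies exact gradients via ForwardDiff.
objective(v) = h(v[1], v[2])
result = optimize(objective, [0.0, 0.0], BFGS(); autodiff = :forward)
xopt, zopt = Optim.minimizer(result)
```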

In contrast, directly nesting calls to `optimize` as in your suggestion is probably suboptimal. You will waste a lot of time optimizing the “inner” problem to high precision just to evaluate the outer objective f(x) = \min_z h(x,z), because you immediately throw away this effort as soon as you update x.
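For contrast, the nested pattern looks something like the following hypothetical sketch (same toy h as above). Every outer evaluation pays for a full inner solve, and differentiating through `optimize` is awkward, so you are often stuck with a derivative-free outer solver on top of it:

```julia
# The discouraged nested pattern, sketched with a toy h.
using Optim

h(x, z) = (x - 1)^2 + (x * z - 2)^2

# f(x) = min_z h(x, z): every evaluation runs an inner optimization to
# convergence, and the outer solver then discards that work by moving x.
f(x) = Optim.minimum(optimize(z -> h(x[1], z[1]), [0.0], BFGS(); autodiff = :forward))

# Derivative-free outer solve, since autodiff through `optimize` isn't free.
result = optimize(f, [0.0], NelderMead())
```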

A trickier case to handle is the “minimax” problem

\min_{x\in X} \left[ \max_{z\in Z} h(x,z) \right].

If Z is a finite set (or can be discretized/approximated by a finite set), one approach is an “epigraph” formulation, which transforms it to an exactly equivalent problem with differentiable constraints by adding a “dummy” variable t:

\min_{x\in X,\ t \in \mathbb{R}} t \\ \mbox{subject to } t \ge h(x,z) \mbox{ for all } z \in Z

at which point you can apply various nonlinear programming algorithms. (One of my favorites is the CCSA algorithm, which can be found in NLopt; I keep meaning to put together a pure-Julia version of this algorithm, since it is so simple and flexible.) If Z is a continuous set, however, matters are more complicated — there are a variety of “continuous minimax” algorithms proposed in the literature for this case, but libraries are harder to find.
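To make the epigraph formulation concrete for a finite Z, here is a hedged sketch using NLopt.jl's quadratic CCSA variant (LD_CCSAQ), with ForwardDiff supplying the constraint gradients; the function h, the set Z, and the tolerances are all illustrative placeholders:

```julia
# Sketch of the epigraph formulation with NLopt.jl, assuming a toy h
# and a small finite Z; the variables are v = (x..., t).
using NLopt, ForwardDiff

h(x, z) = (x[1] - z)^2 + 0.1 * x[2]^2   # hypothetical h(x, z)
Z = [-1.0, 0.0, 1.0]                     # hypothetical finite set
n = 2                                    # dimension of x

opt = Opt(:LD_CCSAQ, n + 1)              # CCSA (quadratic variant) in NLopt

# Objective: minimize the dummy variable t = v[end].
function obj(v, grad)
    if length(grad) > 0
        grad .= 0
        grad[end] = 1
    end
    return v[end]
end
min_objective!(opt, obj)

# One constraint h(x, z) - t <= 0 per z in Z, gradients via ForwardDiff.
for z in Z
    cz(v) = h(v[1:n], z) - v[end]
    inequality_constraint!(opt, (v, grad) -> begin
        length(grad) > 0 && ForwardDiff.gradient!(grad, cz, v)
        cz(v)
    end, 1e-8)
end

opt.xtol_rel = 1e-6
minf, minv, ret = optimize(opt, zeros(n + 1))
x_opt, t_opt = minv[1:n], minv[end]
```

Each z in Z contributes its own differentiable constraint h(x,z) - t \le 0, so minimizing t pushes it down onto \max_{z\in Z} h(x,z).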
