You’re absolutely right that nested optimization is not the best solution for the example I gave. However, there are problems for which nesting optimizations is probably best. I intended this more as a general method for dealing with nested uses of `optimize` when you want to use forward autodiff. In my research, for example (which was the impetus for trying to get this to work), I’m trying to solve
where the $x$, $d$, and $z$ are all vectors, and $n \approx 100$. In this case, solving for all the $z$ in a single `optimize` call is infeasible: I would have an input vector several hundred entries long.
This sort of thing comes up a lot in economics (esp. macro/IO/finance), where you have a low-dimensional set of parameters $\theta$ and a large set of decision-makers $i$ who make decisions $z_i$ based on $\theta$.
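
To make that structure concrete, here is a minimal sketch in plain Python. It is not the actual research problem, and everything in it is an assumption made for illustration: the inner objective `0.5*z**2 + 0.25*z**4 - theta*d*z`, the names `Dual`, `inner_solve`, and `outer_objective`, the scalar `theta`, and the toy data. The inner solve uses Newton's method on the first-order condition rather than a library `optimize` call, and forward-mode derivatives come from a hand-rolled dual number rather than an autodiff package. The point is only that each decision-maker gets a small separate solve, and the derivative with respect to `theta` is carried through those solves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number val + der*eps with eps**2 == 0 (forward-mode AD)."""
    val: float
    der: float = 0.0

    @staticmethod
    def lift(x):
        # Promote plain numbers to Dual with zero derivative part.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, o):
        return self + (-Dual.lift(o))

    def __rsub__(self, o):
        return Dual.lift(o) + (-self)

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))


def inner_solve(theta, d, iters=30):
    """One decision-maker's problem (hypothetical objective):
    minimize 0.5*z**2 + 0.25*z**4 - theta*d*z over z.
    Newton's method on the first-order condition z + z**3 - theta*d = 0.
    A fixed iteration count lets the derivative part settle too."""
    c = theta * d
    z = Dual.lift(0.0)
    for _ in range(iters):
        foc = z + z * z * z - c
        z = z - foc / (1.0 + 3.0 * z * z)
    return z


def outer_objective(theta, ds, ys):
    """Sum of squared gaps between each optimal z_i(theta) and a target y_i."""
    total = Dual.lift(0.0)
    for d, y in zip(ds, ys):
        r = inner_solve(theta, d) - y
        total = total + r * r
    return total


# Toy data for n = 100 decision-makers (made up for this sketch).
ds = [0.5 + 0.01 * i for i in range(100)]
ys = [0.3] * 100

# Seed theta with derivative 1.0 to get d(objective)/d(theta) in one pass.
out = outer_objective(Dual(1.2, 1.0), ds, ys)
print(out.val, out.der)
```

With a vector `theta`, forward mode needs one such pass per component, which stays cheap as long as `theta` is low-dimensional. The outer optimizer only ever sees `theta`, never the hundreds of `z_i`.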