If you are doing bilevel optimization (optimizing a function that itself solves an optimization problem), you can define your own rrule (a vector–Jacobian product) for the inner solve to tell Zygote how to differentiate it efficiently using the implicit-function theorem. (Basically, you differentiate the KKT conditions that describe your inner optimum.)
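To spell out the implicit-function step in the simplest setting, here is a sketch that assumes an unconstrained, smooth inner problem with an invertible Hessian. The symbols h (inner objective), f (outer objective), x* (inner optimum), p (outer parameters), and λ are notation introduced here for illustration. With constraints, the same calculation would be applied to the full KKT system rather than to the stationarity condition alone.

$$
x^*(p) = \arg\min_x h(x, p)
\quad\Longrightarrow\quad
\nabla_x h\bigl(x^*(p), p\bigr) = 0 .
$$

Differentiating the stationarity condition with respect to p gives

$$
\nabla^2_{xx} h \, \frac{\partial x^*}{\partial p} + \nabla^2_{xp} h = 0
\quad\Longrightarrow\quad
\frac{\partial x^*}{\partial p} = -\left(\nabla^2_{xx} h\right)^{-1} \nabla^2_{xp} h .
$$

For the vector–Jacobian product, given the incoming cotangent $\bar{x} = \partial f / \partial x^*$, one solves a single linear system with the (symmetric) inner Hessian:

$$
\nabla^2_{xx} h \, \lambda = \bar{x},
\qquad
\bar{p} = -\left(\nabla^2_{xp} h\right)^{T} \lambda .
$$

So the pullback costs one linear solve at the converged point, regardless of how many iterations the inner solver took.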
In general, AD tools need a bit of “help” whenever the function you are differentiating solves a problem approximately by an iterative method (e.g. Newton iterations for root finding, iterative optimization algorithms, or adaptive quadrature). Even if AD can analyze the iterations, it will end up wasting a lot of effort trying to exactly differentiate the error in your approximation.
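As a rough illustration of what such “help” can look like, here is a minimal sketch using ChainRulesCore's rrule, which Zygote picks up automatically. Everything specific in it is an assumption made up for the example: the scalar inner problem (minimize x^4/4 + x^2/2 - p*x, whose stationarity condition is x^3 + x - p = 0), and the names `g`, `dgdx`, `dgdp`, `inner_solve`, and `outer`. The point is only that the pullback uses the implicit-function theorem at the converged solution instead of differentiating through the Newton loop.

```julia
using ChainRulesCore, Zygote

# Hypothetical inner problem: x*(p) = argmin_x  x^4/4 + x^2/2 - p*x.
# Its stationarity condition is g(x, p) = x^3 + x - p = 0.
g(x, p)    = x^3 + x - p
dgdx(x, p) = 3x^2 + 1     # ∂g/∂x (the inner Hessian; always positive here)
dgdp(x, p) = -one(x)      # ∂g/∂p

# Solve g(x, p) = 0 approximately by Newton iteration.
function inner_solve(p; tol=1e-12, maxiter=100)
    x = zero(float(p))
    for _ in 1:maxiter
        δ = g(x, p) / dgdx(x, p)
        x -= δ
        abs(δ) ≤ tol * (1 + abs(x)) && break
    end
    return x
end

# Custom reverse rule: run the solver once, then apply the
# implicit-function theorem, dx/dp = -(∂g/∂x)⁻¹ ∂g/∂p, at the solution.
# AD never looks inside the Newton loop.
function ChainRulesCore.rrule(::typeof(inner_solve), p; kwargs...)
    x = inner_solve(p; kwargs...)
    function inner_solve_pullback(x̄)
        p̄ = -dgdp(x, p) / dgdx(x, p) * unthunk(x̄)
        return NoTangent(), p̄
    end
    return x, inner_solve_pullback
end

# Hypothetical outer objective that depends on the inner solution.
outer(p) = (inner_solve(p) - 3)^2

# At p = 2 the inner solution is x = 1, so analytically
# d(outer)/dp = 2(x - 3) / (3x^2 + 1) = -1.
dp = Zygote.gradient(outer, 2.0)[1]
```

For a vector-valued inner problem, the division by `dgdx` would become a linear solve with the inner Jacobian (or Hessian), but the structure of the rule stays the same.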
See also “Differentiating optimization problem solutions in Julia”.