How to force Flux to use FiniteDiff

stevengj · February 16, 2022, 10:40pm

If you are doing a bilevel optimization (optimizing a function that itself solves an optimization problem), you can declare your own rrule (vector–Jacobian product) to tell Zygote how to differentiate it efficiently using the implicit-function theorem. (Basically, you differentiate using the KKT conditions describing your inner optimum.)
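As a rough illustration of what such a rule can look like, here is a minimal sketch, not a definitive implementation. The inner problem, the function name `inner_solve`, and the fixed Newton iteration count are all hypothetical choices made for the example. It is unconstrained, so the KKT conditions reduce to the stationarity condition g(x, p) = x - p + x^3 = 0, and the implicit-function theorem gives dx/dp = -(∂g/∂x)⁻¹ ∂g/∂p = 1 / (1 + 3x^2) elementwise.

```julia
using ChainRulesCore, Zygote

# Hypothetical inner problem (an assumption for this sketch):
#   x*(p) = argmin_x sum((x - p)^2 / 2 + x^4 / 4)
# Its optimum satisfies g(x, p) = x - p + x^3 = 0 elementwise.
function inner_solve(p::AbstractVector)
    x = float.(p)                          # initial guess (a copy of p)
    for _ in 1:50                          # fixed Newton iteration count
        g = x .- p .+ x .^ 3               # residual of the optimality condition
        x = x .- g ./ (1 .+ 3 .* x .^ 2)   # Newton step; ∂g/∂x is diagonal here
    end
    return x
end

# Custom reverse rule: differentiate the optimality condition at the
# converged solution instead of letting AD trace the Newton iterations.
function ChainRulesCore.rrule(::typeof(inner_solve), p::AbstractVector)
    x = inner_solve(p)
    function inner_solve_pullback(xbar)
        # Vector–Jacobian product with dx/dp = 1 / (1 + 3x^2) (diagonal)
        pbar = unthunk(xbar) ./ (1 .+ 3 .* x .^ 2)
        return NoTangent(), pbar
    end
    return x, inner_solve_pullback
end

# Outer objective; Zygote picks up the rrule above when it reaches inner_solve.
outer(p) = sum(abs2, inner_solve(p))
grad = Zygote.gradient(outer, [1.0, 2.0])[1]
```

In a real problem the Jacobian ∂g/∂x is usually not diagonal, so the pullback would solve a linear system with its transpose rather than divide elementwise.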

In general, AD tools need a bit of “help” whenever the function you are differentiating solves a problem approximately by an iterative method (e.g. Newton iterations for root finding, iterative optimization algorithms, or adaptive quadrature). Even if AD can trace through the iterations, it will waste a lot of effort trying to exactly differentiate the error in your approximation.

See also: “Differentiating optimization problem solutions in Julia”