Zygote @adjoint with matrices

theogf · December 12, 2019, 10:32am

Hi!

I would like to use Zygote, because of the amazing possibility to tag the parameters you want to optimize and let Zygote do the rest.
However in my case I am creating a matrix (K) given a (potentially highly nested) list of parameters (theta) and passing them to a function (f) to get a scalar.
I have analytically derived the gradient df/dtheta = g(dK/dtheta), which is non-linear (and contains a fair share of optimization tricks), and Zygote works perfectly for differentiating K. My initial solution was to compute dK/dtheta for each individual parameter and pass it to g to get df/dtheta, but this is very inefficient and impractical.
Now I want to write an @adjoint that would contain g(K) but I have no idea how to go about it since it’s not a jacobian-vector product anymore…

Here is a simple example with the derivations

How can I write an appropriate adjoint for this?

simeonschaub · December 12, 2019, 1:25pm

In Zygote, the pullback, which maps the previous jacobian to the new jacobian, is just an arbitrary function, so it doesn’t necessarily have to be a jacobian-vector product. It should be as easy as:

@adjoint f(K) = f(K), J -> (g(J),)
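A minimal sketch of this pattern for a matrix argument, assuming a stand-in `f(K) = logdet(K)` and a hypothetical `build_K` (neither is from the original post; `D` and `E` are made-up constants). It only illustrates the shape of the pullback: for a scalar-valued `f`, the argument `Δ` is the scalar sensitivity of the output, and the pullback returns a matrix-shaped sensitivity for `K`.

```julia
using Zygote, LinearAlgebra

# Hypothetical stand-in for f: a scalar function of a matrix.
f(K) = logdet(K)

# Custom adjoint. The pullback receives the scalar sensitivity Δ of the output
# and returns a tuple with the matrix-shaped sensitivity Δ * ∂f/∂K.
# For logdet, ∂f/∂K = inv(K)'.
Zygote.@adjoint f(K) = f(K), Δ -> (Δ * inv(K)',)

# Hypothetical constants and matrix constructor (assumptions for illustration).
const D = [0.0 1.0; 1.0 0.0]
const E = [1.0 0.0; 0.0 1.0]
build_K(θ) = exp.(-θ[1] .* D) .+ θ[2] .* E

θ = [1.0, 0.5]
ps = Zygote.Params([θ])

# Zygote differentiates build_K itself and chains it with the custom adjoint of f.
gs = Zygote.gradient(() -> f(build_K(θ)), ps)
grad_θ = gs[θ]
```

Here `grad_θ[i]` should agree with `tr(inv(K) * dK/dθ[i])`, the usual analytic result for `logdet`, so the custom adjoint composes with the implicit-parameter style without ever computing `dK/dθ` by hand.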

theogf · December 12, 2019, 2:27pm

The problem is that when I do this J is a scalar.

simeonschaub · December 12, 2019, 2:39pm

You’re right, Zygote does reverse-mode differentiation, so the argument to the pullback of f is actually df/df, which is just one. In your case, I would suggest looking into forward-mode AD using ForwardDiff instead because it should be much more efficient for differentiating K and it will be easier to implement this custom adjoint for f.
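To spell out why the pullback of f only sees a scalar, here is the reverse-mode chain rule for a scalar output f(K(theta)). The bar notation for sensitivities is introduced here for illustration and is not from the thread.

```latex
\begin{align}
  \bar{f} &= \frac{\partial f}{\partial f} = 1
    && \text{(seed passed to the pullback of } f\text{)} \\
  \bar{K}_{jk} &= \bar{f}\,\frac{\partial f}{\partial K_{jk}}
    && \text{(what the pullback of } f \text{ returns)} \\
  \bar{\theta}_i &= \sum_{j,k} \bar{K}_{jk}\,\frac{\partial K_{jk}}{\partial \theta_i}
    && \text{(computed by the pullback of } K\text{)}
\end{align}
```

The pullback of f never receives dK/dtheta. It can only return a sensitivity with the shape of K, which is then contracted linearly with dK/dtheta, so a gradient of the form g(dK/dtheta) with non-linear g does not fit directly into this scheme.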

theogf · December 12, 2019, 2:57pm

The only problem is that I need Zygote's implicit differentiation: I need to rely on Zygote.params.

simeonschaub · December 12, 2019, 3:02pm

You can use ForwardDiff within Zygote with the function forwarddiff. See also here in the Zygote docs.

theogf · December 12, 2019, 3:17pm

Thanks, but this is not compatible with the Params approach.

theogf · December 14, 2019, 1:41pm

I finally found a workaround!
Since my concern was optimizing the gradient computations (avoiding precomputed inverses etc.), I simply wrote a new function whose gradient is equivalent to the one I want!
It is slightly less efficient but works pretty well for now!
