# Computing function Jacobian from Zygote.pullback

The Zygote documentation on custom adjoints states that `dx/dy` (in the docs' notation) is the Jacobian when `x` and `y` are vectors. I am struggling to relate this to an implementation.

For a function `g : R^N -> R^M` modeled with `g = Chain(Dense(N, H, tanh), Dense(H, M))` with hidden dimension `H`, I intuitively thought `dy/dx` would be an `MxN` matrix, but as we all know intuition often fails, so I experimented a bit. Running

```julia
using Flux, Zygote
N, M, H = 2, 3, 16
g = Chain(Dense(N, H, tanh), Dense(H, M))
x = randn(N, 1)
y, back = Zygote.pullback(g, x)
back([])
```

gives `DimensionMismatch("matrix A has dimensions (16,3), matrix B has dimensions (0,1)")`, so it looks like `dy/dx` is an `MxH` matrix. Okay, cool, I guess this is the first matrix multiplication in the first `back` call. But do I even get an `MxN` matrix if I compute the backward pass? No: running `back(ones(size(y)...))` returns an `Nx1` matrix. So how would I compute the `MxN` Jacobian?

I am clearly not thinking about this the right way. Could someone give me some pointers on this problem, please? Reading the documentation has unfortunately not made it "click" for me.

Pull back the basis elements of the output space, i.e. `[1, 0, 0]`, `[0, 1, 0]`, `[0, 0, 1]` for `M = 3`.
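A minimal sketch of that suggestion, reusing the `N`, `M`, `H` and model from the question (with `x` as a plain vector so the output is a length-`M` vector): each pullback of a standard basis vector of `R^M` yields one row of the `MxN` Jacobian, because `back(e_i)` is the gradient of `y[i]` with respect to `x`.

```julia
using Flux, Zygote

N, M, H = 2, 3, 16
g = Chain(Dense(N, H, tanh), Dense(H, M))
x = randn(N)

y, back = Zygote.pullback(g, x)

# Pulling back the i-th basis vector of the output space gives the
# i-th row of the M×N Jacobian (the gradient of y[i] w.r.t. x).
J = zeros(M, N)
for i in 1:M
    e = zeros(M)
    e[i] = 1
    J[i, :] = back(e)[1]
end
```

This makes `M` backward passes, one per output dimension, which is the expected cost of a full Jacobian in reverse mode.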


You can steal this implementation: https://github.com/FluxML/Zygote.jl/issues/98#issuecomment-497739021, or this one: https://github.com/FluxML/Zygote.jl/pull/235/files.


Thank you for the responses.

So the way to do it is to compute each column of the Jacobian separately? I thought that would be inefficient and that there would be a smarter, vectorized way of doing it. I could very well be wrong about this, though!

No, each row. And if you have a neural network, pulling back the identity matrix will give the Jacobian.

Pulling back the identity matrix will give the Jacobian transpose.
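To make the last two replies concrete, here is a sketch assuming the model from the question (the names `xs` and `Jt` are introduced here for illustration). Because the network acts on a batch column by column, replicating `x` across `M` columns lets you seed the pullback with the `MxM` identity in a single call; what comes back is the transpose of the Jacobian:

```julia
using Flux, Zygote, LinearAlgebra

N, M, H = 2, 3, 16
g = Chain(Dense(N, H, tanh), Dense(H, M))
x = randn(N, 1)

# Replicate x across a batch of size M. The network maps each column
# independently, so the output ys is M×M and the pullback accepts an
# M×M cotangent, such as the identity matrix.
xs = repeat(x, 1, M)
ys, back = Zygote.pullback(g, xs)

Jt = back(Matrix{Float64}(I, M, M))[1]  # N×M: the Jacobian transpose
J = Jt'                                 # M×N Jacobian
```

Column `j` of `Jt` is the pullback of the basis vector `e_j`, i.e. the gradient of `y[j]` with respect to `x`, which is why a single identity pullback returns `J'` rather than `J`. Newer Zygote versions also provide a `Zygote.jacobian` convenience function that handles this for you.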

Thank you for your responses. I think I have a good understanding of it now!