Jacobian&vectors in Zygote

I have a function with three inputs and two outputs. The first argument is a vector, the two others are scalars. The function is built from smaller functions. I want to get derivatives of the composite function with respect to the second argument (so scalar). For that I am using jacobian from Zygote. However, I get different results from jacobian that I would obtain manually. What’s wrong?

using Zygote

f1 = (u,β) -> [1 -1+3*(u[2].^2)]
f2 = (u,β) -> f1(u,β) .+ β
f3 = (u,β,y) -> [ 3.0*u[1]+u[2] -3+u[1]+2*u[2]-3*(u[2]^2) 1 ]
f4 = (u,β) -> [I f2(u,β)']
fImportant = (u,β,y) -> 2* f4(u,β)*transpose(f3(u,β,y))

ftest =x -> fImportant([0,0],x,0.5)
case1 = jacobian(x->ftest(x),0.0)

fManualtest = (bb)-> [2+2*bb;-8+2*bb]
case2 = jacobian(x->fManualtest(x),0.0)

xx=range(-1,1,1000)
norm(ftest.(xx)-fManualtest.(xx)) ######Checking that the manual function is actually correct

It seems that Zygote had some issues with jacobians but the topic was closed, so I am assuming the issue was solved. I am using Zygote v0.6.71.

Using ForwardDiff works:

using ForwardDiff
ftest =x -> fImportant([0,0],x,0.5)
case1 = ForwardDiff.derivative(x->ftest(x),0.0)

fManualtest = (bb)-> [2+2*bb;-8+2*bb]
case2 = ForwardDiff.derivative(x->fManualtest(x),0.0)

Can you show the results of your code / explain what you mean by “different”? Is it just the shape of the result, or the actual numerical values inside? Or does Zygote error?

Oh, sorry, I meant the values are different:
image

Attempting to narrow this down, f4 seems to be where the problem shows up?

julia> (u,β,y) = [0,0], 0.0, 0.5;

julia> ForwardDiff.derivative(β -> f4(u, β), β)
2×3 Matrix{Float64}:
 0.0  0.0  1.0
 0.0  0.0  1.0

julia> Zygote.jacobian(f4, u, β) |> js -> reshape(js[2], 2,3)
2×3 Matrix{Float64}:
 0.0  1.0  0.0
 0.0  1.0  0.0

Or without this example’s code:

julia> Zygote.jacobian(x -> [I x], [1.0, 2.0])[1]
6×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0
 1.0  0.0
 0.0  1.0
 0.0  0.0
 0.0  0.0

julia> ForwardDiff.jacobian(x -> [I x], [1.0, 2.0])
6×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 1.0  0.0
 0.0  1.0

julia> [I [1, 2]] == [I(2) [1, 2]] == [1 0 1; 0 1 2]
true

julia> Zygote.jacobian(x -> [I(2) x], [1.0, 2.0])[1]
6×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 1.0  0.0
 0.0  1.0

Changing the code above to f4 = (u,β) -> [I(2) f2(u,β)'] seems to also work. So I assume this is a bug in how Zygote handles hvcat(::UniformScaling, ::AbstractMatrix).

(The linked issue is just a request that someone write a jacobian function at all. Surely there are bugs but that isn’t a sign. Note that the 99% of what Zygote.jacobian does is what Zygote always does – most bugs will also be visible in calls to gradient.)

Edit: the bug is that here ChainRules assumes hcat only acts on numbers & arrays, not I. A quick check is:

julia> using ChainRules, LinearAlgbra

julia> ChainRules._catsize(::UniformScaling) = error("not supported")

julia> Zygote.jacobian(x -> [I x], [1.0, 2.0])[1]
ERROR: not supported
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:44
  [2] _catsize(::UniformScaling{Bool})
    @ Main ./REPL[69]:1
  [3] map
    @ ./tuple.jl:357 [inlined]
  [4] rrule(::typeof(hcat), ::UniformScaling{Bool}, ::Vector{Float64})
    @ ChainRules ~/.julia/packages/ChainRules/vdf7M/src/rulesets/Base/array.jl:228
  [5] rrule(::Zygote.ZygoteRuleConfig{Zygote.Context{false}}, ::Function, ::UniformScaling{Bool}, ::Vector{Float64})
    @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/rules.jl:138

4 Likes