Trying to get my head around AD in Julia and being inspired by Jarret Revels’ example of computing the Jacobian of a cumprod()
function with ForwardDiff.jl
, I tried to see whether I could get the same done with Zygote.jl
. I got an unexpected results with a wrong answer. Maybe someone can help explain where I went wrong or what the limitations of Zygote
are.
First I realised that naive implementations of cumprod()
including Base.cumprod()
are not suitable for Zygote
due to the use of array mutation. So I came up with my own function, which avoids mutating arrays and constructs the cum-product with an array comprehension and an accumulating function.
## accumulating function
mutable struct Accu{T} <: Function
a::T
end
Accu(T::Type) = Accu(one(T))
function (c::Accu)(arg)
c.a *= arg
end
function ac_cumprod(x)
acc = Accu(eltype(x))
return [acc(el) for el in x]
end
It seems to do what it should,
julia> ac_cumprod([1,2,3])
3-element Array{Int64,1}:
1
2
6
and ForwardDiff
computes the Jacobian correctly (and so does ReverseDiff
)
julia> using ForwardDiff
julia> @btime ForwardDiff.jacobian(ac_cumprod, [1,2,3])
442.434 ns (7 allocations: 864 bytes)
3×3 Array{Int64,2}:
1 0 0
2 1 0
6 3 2
In order to compute the Jacobian with Zygote
, I used code I found in pull request #747 (slightly modified):
using Zygote
"""
zy_jacobian(f,x)
Construct the Jacobian of `f` where `x` is a real-valued array
and `f(x)` is also a real-valued array.
"""
function zy_jacobian(f, x)
y, back = Zygote.pullback(f, x)
k = length(y)
n = length(x)
J = Matrix{eltype(y)}(undef, k, n)
e_i = fill!(similar(x), 0)
@inbounds for i = 1:k
e_i[i] = oneunit(eltype(x))
J[i, :] = back(e_i)[1]
e_i[i] = zero(eltype(x))
end
J
end
Using this function to compute the Jacobian of my function, however, yields a wrong result:
julia> @btime zy_jacobian(ac_cumprod, [1,2,3])
76.221 μs (333 allocations: 12.16 KiB)
3×3 Array{Int64,2}:
1 0 0
8 4 2
54 27 18
Moreover, execution is orders of magnitude slower that with ForwardDiff
. So where did things go wrong?