It seems that Zygote silently fails when differentiating a function that takes and outputs CircularArrays , using a Zygote.Buffer
to allow mutation. Here’s a comparison with ForwardDiff:
using Zygote, ForwardDiff, CircularArrays
a = CircularArray([1.0])
function f(x)
new = Zygote.Buffer(x)
new[1] = x[1] + x[2]
return copy(new)
end
display(Zygote.jacobian(f, a)[1])
# 1×1 CircularArray(::Matrix{Float64}):
# 1.0
display(ForwardDiff.jacobian(f, a))
# 1×1 CircularArray(::Matrix{Float64}):
# 2.0
b = [1.0]
function g(x)
new = Zygote.Buffer(x)
new[1] = x[1] + x[1]
return copy(new)
end
display(Zygote.jacobian(g, b)[1])
# 1×1 Matrix{Float64}:
# 2.0
display(ForwardDiff.jacobian(g, b))
# 1×1 Matrix{Float64}:
# 2.0
Is this a known limitation of Zygote ? Given the simplicity of the implementation of CircularArrays, I would’ve expected it to work.
Thank you!
I’m not sure why, but this also happens for OffsetArrays
Sorry, I misunderstand my outputs…
(@v1.10) pkg> st Zygote ForwardDiff OffsetArrays
Status `~/.julia/environments/v1.10/Project.toml`
[f6369f11] ForwardDiff v0.10.36
[6fe1bfb0] OffsetArrays v1.13.0
[e88e6eb3] Zygote v0.6.68
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.10.0 (2023-12-25)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> using Zygote, ForwardDiff, OffsetArrays
julia> function f(x)
new = Zygote.Buffer(x)
new[1] = x[begin] + x[begin]
return copy(new)
end
f (generic function with 1 method)
julia> function f2(x)
new = Zygote.Buffer(x)
return [x[begin] + x[begin]]
end
f2 (generic function with 1 method)
julia> oa1 = OffsetArray([1], 1:1)
1-element OffsetArray(::Vector{Int64}, 1:1) with eltype Int64 with indices 1:1:
1
julia> display(Zygote.jacobian(f, oa1)[begin])
1×1 Matrix{Int64}:
2
julia> display(ForwardDiff.jacobian(f, oa1))
1×1 Matrix{Int64}:
2
julia> display(Zygote.jacobian(f2, oa1)[begin])
1×1 Matrix{Int64}:
2
julia> display(ForwardDiff.jacobian(f2, oa1))
1×1 Matrix{Int64}:
2
@trakjohnson
Adding a method of ∇getindex with CircularArray
type should solve your issue.
"""
∇getindex(x, dy, inds...)
For the `rrule` of `y = x[inds...]`, this function is roughly
`setindex(zero(x), dy, inds...)`, returning the array `dx`.
Differentiable. Includes `ProjectTo(x)(dx)`.
"""
function ∇getindex(x::AbstractArray{T,N}, dy, inds...) where {T,N}
# `to_indices` removes any logical indexing, colons, CartesianIndex etc,
# leaving just Int / AbstractVector of Int
plain_inds = Base.to_indices(x, inds)
dx = if plain_inds isa NTuple{N, Int} && T<:Number
# scalar indexing
OneElement(dy, plain_inds, axes(x))
else # some from slicing (potentially noncontigous)
dx = _setindex_zero(x, dy, plain_inds...)
∇getindex!(dx, dy, plain_inds...)
end
return ProjectTo(x)(dx) # since we have x, may as well do this inside, not in rules
end
julia> import ChainRules
julia> using CircularArrays, Zygote, ForwardDiff
julia> function ChainRules.∇getindex(x::CircularArray, dy, i::Int)
m = eachindex(IndexLinear(), x.data)
ChainRules.∇getindex(x.data, dy, mod(i, m))
end
julia> a = CircularArray([1.0])
1-element CircularVector(::Vector{Float64}):
1.0
julia> function f(x)
new = Zygote.Buffer(x)
new[1] = x[1] + x[2]
return copy(new)
end
f (generic function with 1 method)
julia> display(Zygote.jacobian(f, a)[1])
1×1 CircularArray(::Matrix{Float64}):
2.0
julia> display(ForwardDiff.jacobian(f, a))
1×1 CircularArray(::Matrix{Float64}):
2.0