Using gradients from struct with Zygote

theogf · November 25, 2020, 6:22pm

Hi!

Let’s say I have some nested struct (in my case the nesting is arbitrarily complicated)

struct Foo
    a::Vector
end

struct Bar
    b::Foo
    c::Float64
end

c = Bar(Foo([2.0]), 1.0)

When using

g = Zygote.gradient(c) do x
   do_something(x)...
end

g[1] will be a NamedTuple of the form (b = (a = [...],), c = nothing).

My problem is that I have no idea how to apply this gradient automatically on my object.

I know there is Flux.params to compute the gradients implicitly, but it has big disadvantages, like saving unwanted gradients, and it also just fails in a lot of cases for me.

What should I do?

DrChainsaw · November 26, 2020, 6:39am

I guess you are referring to the mechanism for implicit gradients, which is meant for cases where you know in advance exactly which parts of the nested struct you want gradients for. In other words, it should not have the problem of producing unwanted gradients. Flux.params is just Flux's convenient way of returning all AbstractArrays found in the nested struct, but afaik it is not tied to that mechanism.

Anyway, the docs (in the same section I linked) recommend against that approach, so it might be better to work with the output you already have.

I don’t know what the best way is, but I think you should be able to use getfield to traverse the struct. I have found that Julia's multiple dispatch makes it relatively painless to recurse into nested structs. Here is an untested skeleton implementation:

# Recurse into the struct, pairing each gradient entry with the field of the same name
apply_gradient(g::NamedTuple, s) = foreach(pairs(g)) do (fieldname, subgradient)
    apply_gradient(subgradient, getfield(s, fieldname))
end

apply_gradient(::Nothing, x) = nothing # No gradient -> do nothing

# In-place gradient step on the parameter array.
# Might want to propagate some policy (e.g. a learning rate) as a third argument.
apply_gradient(g::AbstractArray, p::AbstractArray) = p .-= g

You might need to add a few methods there depending on what one might find in your structure (e.g. if there are arrays or tuples of structs in there).
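
To make the idea above concrete, here is a minimal sketch of the same recursion with a learning rate threaded through and one extra method for tuples of structs. The name `update!`, its argument order, and the `lr` argument are assumptions for illustration, not part of Zygote or Flux. The gradient is written by hand as a stand-in for what `Zygote.gradient(...)[1]` would return, so the snippet runs without any packages.

```julia
struct Foo
    a::Vector{Float64}
end

struct Bar
    b::Foo
    c::Float64
end

# No gradient for this part of the structure -> leave it untouched
update!(x, ::Nothing, lr) = nothing

# Leaf case: plain gradient step, written in place into the parameter array
function update!(p::AbstractArray{<:Number}, g::AbstractArray{<:Number}, lr)
    p .-= lr .* g
    return nothing
end

# Tuples of structs: walk both tuples in lockstep
update!(s::Tuple, g::Tuple, lr) = foreach((si, gi) -> update!(si, gi, lr), s, g)

# Structs: the gradient is a NamedTuple keyed by field name
function update!(s, g::NamedTuple, lr)
    foreach(keys(g)) do name
        update!(getfield(s, name), getfield(g, name), lr)
    end
end

c = Bar(Foo([2.0]), 1.0)
g = (b = (a = [1.0],), c = nothing)  # hand-written stand-in for a Zygote gradient
update!(c, g, 0.5)
c.b.a  # now [1.5]
```

Note that only arrays can be updated this way. A scalar field such as c::Float64 in an immutable struct cannot be modified in place, so a non-nothing gradient for it would need a different strategy, for example rebuilding the struct.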

theogf · November 26, 2020, 12:35pm

Thanks, this is exactly what I was looking for.
