My gradients sometimes appear with keys of type GlobalRef. I do not understand why and I do not understand how to change that.
Example code:
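    using Flux

    function h(x; bs=bs)
        # add every entry except the last by iterating over a slice
        for b in bs[1:end-1]
            x = x + b
        end
        # add the last entry by indexing directly
        x = x + bs[end]
        only(x)
    end

    bs = [ [0.0], [1.0], [2.0] ]
    x = [0.0]  # assumed value: x was not defined in the snippet as posted

    grads = gradient(() -> h(x, bs=bs), params(bs))

    @show grads[bs[1]]
    @show grads[bs[2]]
    @show grads[bs[3]]

    # look for entries that are keyed by a GlobalRef instead of an array
    for (k, v) in grads.grads
        if k isa GlobalRef
            @show k
            @show v
        end
    end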
The output shows that the grads.grads IdDict contains the expected gradients for bs[1] and bs[2], keyed by the arrays themselves (I am not sure what the right terminology is), but nothing for bs[3]. The same IdDict does, however, contain one key that is a GlobalRef to :(Main.bs); its value contains nothing for the first two elements, but the expected gradient for the last element.
- What should this function look like to give me the expected gradient as grads[bs[3]]?
- What determines if the gradient appears with a GlobalRef or an array as its key in the IdDict, and what do I need to understand to avoid this problem in the future?
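
For comparison, here is a minimal sketch of the explicit-argument style of gradient, which returns one gradient per argument instead of an IdDict, so there are no keys involved at all. This is only a sketch under assumptions: the input x = [0.0] is my own choice, h is rewritten with a required keyword so it does not fall back to the global bs, and I am not claiming this explains the GlobalRef behaviour of the params version above.

```julia
using Flux  # re-exports Zygote's `gradient`

# Same function as in the example, but `bs` is a required keyword,
# so there is no default that refers to the global variable.
function h(x; bs)
    for b in bs[1:end-1]
        x = x + b
    end
    x = x + bs[end]
    only(x)
end

bs = [[0.0], [1.0], [2.0]]
x = [0.0]  # assumed input, chosen only for illustration

# Explicit style: `bs` is an argument of the differentiated function,
# so `gradient` returns a tuple with one entry per argument.
g = gradient(b -> h(x; bs=b), bs)[1]

# Entries line up with `bs` by position: g[i] belongs to bs[i].
for i in eachindex(bs)
    @show i g[i]
end
```

With this form the gradient for the last element would be read as g[3] rather than grads[bs[3]], which sidesteps the question of which key the IdDict uses.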