My gradients sometimes appear with keys of type GlobalRef. I do not understand why, and I do not understand how to change that.
Example code:
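    using Flux

    # Adds every element of bs to x, treating the last element separately.
    function h(x; bs=bs)
        for b in bs[1:end-1]
            x = x + b
        end
        x = x + bs[end]
        only(x)
    end

    bs = [ [0.0], [1.0], [2.0] ]
    # x was not defined in the original snippet; a one-element vector is
    # assumed here so that only(x) works.
    x = [0.0]

    grads = gradient(() -> h(x, bs=bs), params(bs))
    @show grads[bs[1]]
    @show grads[bs[2]]
    @show grads[bs[3]]

    # Look for entries keyed by a GlobalRef instead of an array.
    for (k, v) in grads.grads
        if k isa GlobalRef
            @show k
            @show v
        end
    end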
The output shows that the grads.grads IdDict contains the expected gradients for bs[1] and bs[2], keyed by the arrays themselves (I am not sure what the correct terminology is), but that it contains nothing for bs[3]. The same IdDict does, however, contain one key that is a GlobalRef to :(Main.bs); its value contains nothing for the first two elements, but the expected gradient for the last element.
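For reference, here is a minimal sketch of how I understand IdDict lookup to work in plain Julia: keys are compared by object identity (===) rather than by value, which is presumably what "keyed by the arrays" means above. The names a, b, d and g are made up for this illustration, and the stored strings are placeholders rather than real gradients.

```julia
a = [0.0]
b = [0.0]                    # same contents as a, but a different object

d = IdDict{Any,Any}()
d[a] = "entry stored under the array object a"

@show a == b                 # the two arrays compare equal by value
@show haskey(d, a)           # the lookup uses identity, so a is found
@show haskey(d, b)           # b is a different object, so it is not found

# A GlobalRef is just another kind of object that can serve as a key.
g = GlobalRef(Main, :bs)
d[g] = "entry stored under the GlobalRef"
@show g isa GlobalRef
@show haskey(d, g)
```

If that understanding is right, an entry stored under a GlobalRef can never be reached by indexing with one of the arrays, even when it holds the gradient for that array.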
- What should this function look like to give me the expected gradient as grads[bs[3]]?
- What determines whether the gradient appears with a GlobalRef or an array as its key in the IdDict, and what do I need to understand to avoid this problem in the future?