I am working on the newest Flux branch and noticed some weird behavior.
When I calculate the loss of my model’s outputs I get the following number:
> y_pred = model(x)
> loss(y_pred, y)
0.003791215f0
However, doing the same calculation while taking gradients gives a different result:
> ps = Params(params(model))
> gradient(ps) do
>     y_pred = model(x)
>     l = loss(y_pred, y)
>     println(l)
>     return l
> end
0.035433643
Maybe there is something obvious I’m missing; otherwise, I’ll probably have to put together a minimal working example.
Thanks for any help!