I think what I am trying to do used to be possible in Zygote, based on the thread here: Gradient of gradient - #8 by martenlienen
The suggestion:
using Flux
net = Dense(10, 1)
x = randn(10, 128) # dims, batch
function pred(x, net)
y, pullback = Zygote.pullback(net, x)
grads = pullback(ones(size(y)))[1]
return grads
end
gradient(() -> sum(pred(x, net)), params(net)
Now throws the same error:
ERROR: Mutating arrays is not supported -- called copyto!(::Matrix{Float64}, _...)
Based on the comments in the linked thread, this used to work fine.