Hello,
I am working on a simple example with Flux and Zygote, mainly translating this notebook to Julia.
The loss function contains the divergence of the model output (the trace of the Jacobian), and Zygote throws an error when I try to take a gradient step.
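If I am reading the notebook correctly, the objective is Hyvärinen's implicit score matching loss, with the model output playing the role of the score s_θ(x) = ∇ₓ log p(x):

$$ J(\theta) = \mathbb{E}_{x}\left[ \operatorname{tr}\big(\nabla_x s_\theta(x)\big) + \tfrac{1}{2}\lVert s_\theta(x)\rVert^2 \right] $$

The trace term is the divergence that shows up in the code below.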
import ManifoldLearning: swiss_roll
using Flux
#using CUDA
using Zygote
using LinearAlgebra

# 2x1000 data: keep only the x and z coordinates of the swiss roll
data = swiss_roll(1000, 1.0)[1][[1, 3], :] #|> gpu

model = Chain(Dense(2, 128, relu),
              Dense(128, 128, relu),
              Dense(128, 2, relu)) #|> gpu

opt = ADAM(1e-3)

function score_matching(model, data)
    logp = model(data)                    # model output = score estimate
    norm_loss = sum(abs2, logp) / 2
    div = tr(jacobian(model, data)[1])    # divergence = trace of the Jacobian
    return 0.5 * (norm_loss + div)
end

ps = Flux.params(model)

function train_step!(model, data, ps, opt)
    gs = gradient(() -> score_matching(model, data), ps)
    Flux.Optimise.update!(opt, ps, gs)
end

train_step!(model, data, ps, opt)
gives
LoadError: Mutating arrays is not supported -- called copyto!(::SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}, _...)
The stacktrace leads to gradcopy in Zygote.withjacobian.
Is it not possible to differentiate a function of the Jacobian?
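For what it's worth, the problem doesn't seem to need Flux at all. Here is a stripped-down sketch (my own minimal example, not from the notebook) that I believe hits the same mutation error:

using Zygote, LinearAlgebra

f(x) = tanh.(x)                        # any smooth vector-valued function
g(x) = tr(Zygote.jacobian(f, x)[1])    # scalar function of the Jacobian

Zygote.gradient(g, rand(3))            # -> "Mutating arrays is not supported"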
Even more surprisingly, when I run on CUDA (by uncommenting the #using CUDA and #|> gpu lines above), I get a different error:
LoadError: Compiling Tuple{CUDA.var"##context!#59", Bool, typeof(context!), CUDA.var"#216#217"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, UInt32, DataType}, CuContext}: try/catch is not supported.
The try/catch seems to come from fill!, which is used by _eyelike in Zygote.withjacobian.
Should I avoid using the Jacobian somehow? I also tried using ForwardDiff.jacobian
instead, but I get yet another error.
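One workaround I am considering (my own idea, not something from the notebook) is to avoid materializing the Jacobian entirely and use a Hutchinson trace estimator, as in sliced score matching: for a random probe v with E[v vᵀ] = I, we have E[vᵀ J v] = tr(J), and vᵀJ is just one reverse-mode pullback. A rough sketch, which I have not verified actually trains (divergence_estimate and the single-probe choice are mine):

using Flux, Zygote

# Hutchinson estimator of tr(J): E[vᵀ J v] = tr(J) when E[v vᵀ] = I.
# Needs only one vector-Jacobian product instead of the full Jacobian.
function divergence_estimate(model, x)
    y, back = Zygote.pullback(model, x)
    v = randn(eltype(y), size(y))   # probe vector, treated as a constant
    vJ = back(v)[1]                 # vᵀ J, shaped like x
    return sum(v .* vJ)             # fine here since input and output shapes match
end

function sliced_score_matching(model, x)
    s = model(x)
    return divergence_estimate(model, x) + sum(abs2, s) / 2
end

This still requires Zygote to differentiate through its own pullback (second-order AD), though, so I am not sure it actually sidesteps the problem.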