When training a model using the boundary loss function, how do you all compute gradients?
Numerical differentiation is easy to apply and gets things up and running, and I've seen a good example here. Using Flux.gradient(...) to compute the gradients seems very expensive and crashes on my system. I see that NeuralPDE.jl offers the option of AD and calls ForwardDiff.derivative(), but that doesn't seem to work on my system either. As I mentioned here, it gives wrong gradients for some reason, and when I try to train on an ansatz solution, I get an error. An MWE for the ansatz version is below:
using Flux, ProgressMeter, LinearAlgebra, Plots, ForwardDiff
k = 5.0;
f(t) = k*t;   # RHS of the ODE u'(t) = f(t)
u0 = 100;     # initial condition u(0) = u0
uNN = Chain(t -> [t],            # wrap the scalar input in a 1-element vector
            Dense(1, 15, tanh),
            Dense(15, 25, tanh),
            Dense(25, 15, tanh),
            Dense(15, 1),
            first)               # unwrap the output back to a scalar
tgrid = sort(rand(100) .* 10.0);   # collocation points on [0, 10]
ϵ = eps(Float64) * 1e10;           # step for central differences (≈ 2.2e-6)
# derivatives of the ansatz (three approaches)
NN_sol(t) = u0 + t*uNN(t)   # ansatz: NN_sol(0) == u0 by construction
# dNN_sol(t) = sum(gradient((t) -> NN_sol(t), t));   # this crashes
dNN_sol2(t) = ForwardDiff.derivative((t) -> NN_sol(t), t);   # this throws an error (stacktrace below)
dNN_sol3(t) = (NN_sol(t + ϵ) - NN_sol(t - ϵ)) / 2ϵ   # central finite difference
@show dNN_sol2(0.5)
@show dNN_sol3(0.5)
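# (added sanity check, hypothetical: ForwardDiff.derivative agrees with the
# central difference on a plain scalar function, so forward-mode AD itself
# seems fine in isolation; `g` is just a stand-in, not part of my model)
g(t) = u0 + t * tanh(t)
@show ForwardDiff.derivative(g, 0.5)
@show (g(0.5 + ϵ) - g(0.5 - ϵ)) / 2ϵ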
loss_ODE() = Flux.mse(f.(tgrid), dNN_sol2.(tgrid));   # match the ansatz derivative to f on the grid
opt = Adam(0.01);
ps = Flux.params(uNN);
epochs = 5000;
lsi = loss_ODE()
losses = [lsi];
ProgressMeter.ijulia_behavior(:clear)
p = Progress(epochs)
for i in 1:epochs
    tgrid = sort(rand(50) .* 10.0);   # resample collocation points every epoch
    gs = gradient(loss_ODE, ps);
    Flux.Optimise.update!(opt, ps, gs);
    lsi = loss_ODE();
    push!(losses, lsi)
    ProgressMeter.next!(p; showvalues = [(:epoch, i),
                                         (:train_loss, lsi),
                                         (:grad_norm, norm(gs, 2))])   # <- this is the call that errors
end
# plot(losses)
xg2 = 0:0.1:10;
plot(xg2, NN_sol.(xg2), label = "NN soln", linewidth = 3)
plot!(xg2, k/2 .* xg2.^2 .+ u0, label = "analytical soln", linestyle = :dash, linewidth = 3)
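Side note: if I swap the finite-difference derivative into the loss, the training loop does run, which is how I got things up and running numerically (loss_ODE_fd is just an illustrative name, a minimal variant of the loss above):
loss_ODE_fd() = Flux.mse(f.(tgrid), dNN_sol3.(tgrid));
The problem appears only with the ForwardDiff-based dNN_sol2.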
The stacktrace from the dNN_sol2 version is:
ERROR: LoadError: MethodError: no method matching iterate(::Nothing)
Closest candidates are:
iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
...
Stacktrace:
[1] isempty(itr::Nothing)
@ Base ./essentials.jl:788
[2] norm(itr::Nothing, p::Int64) (repeats 2 times)
@ LinearAlgebra /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:591
[3] MappingRF
@ ./reduce.jl:95 [inlined]
[4] _foldl_impl(op::Base.MappingRF{typeof(norm), Base.BottomRF{typeof(max)}}, init::Base._InitialValue, itr::Zygote.Grads)
@ Base ./reduce.jl:58
[5] foldl_impl
@ ./reduce.jl:48 [inlined]
[6] mapfoldl_impl
@ ./reduce.jl:44 [inlined]
[7] #mapfoldl#258
@ ./reduce.jl:162 [inlined]
[8] mapfoldl
@ ./reduce.jl:162 [inlined]
[9] #mapreduce#262
@ ./reduce.jl:294 [inlined]
[10] mapreduce
@ ./reduce.jl:294 [inlined]
[11] generic_normInf(x::Zygote.Grads)
@ LinearAlgebra /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:453
[12] normInf
@ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:522 [inlined]
[13] generic_norm2(x::Zygote.Grads)
@ LinearAlgebra /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:463
[14] norm2
@ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:524 [inlined]
[15] norm(itr::Zygote.Grads, p::Int64)
@ LinearAlgebra /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:593
[16] top-level scope
@ ~/Desktop/MT_fwd_surrogate/pinn_ode1.jl:45
in expression starting at /Users/asingh933/Desktop/MT_fwd_surrogate/pinn_ode1.jl:39
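From the trace, the failing call is norm(itr::Nothing, p::Int64), which suggests that gradient(loss_ODE, ps) returned nothing for at least some of the parameters, i.e. Zygote apparently doesn't propagate gradients through ForwardDiff.derivative back to the implicit Flux.params. A minimal sketch of a defensive norm that would at least avoid the MethodError (safe_grad_norm is just a name I made up, assuming the nothing entries are the culprit):
# skip parameters whose gradient came back as `nothing`
safe_grad_norm(gs, ps) = sqrt(sum(p -> gs[p] === nothing ? 0.0 : sum(abs2, gs[p]), ps))
But if every parameter's gradient comes back as nothing, update! is effectively a no-op and training wouldn't progress anyway, so the real question remains why the ForwardDiff path drops the parameter gradients.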
Sorry if this overlaps a lot with the previous post; this one focuses more on the issues with ForwardDiff.jl. Thanks in advance for any help you can provide!