Loss function in NeuralPDE.jl

Vivi · February 4, 2024, 3:52pm

Hi,
I’m trying to train a PINN using NeuralPDE.jl. I was able to start the training and reached what is, at least in my opinion, quite a small loss (< 1e-7). But when I plotted the network solution against the analytic solution, the two looked very different.

That’s why I wanted to understand the loss function. Sadly, I was not able to find anything about how it is defined, except that it is a combination of the boundary condition loss and the loss for the PDE. So I would be very happy if someone here could help me understand how the loss is defined.

For you to better understand the problem, I have prepared my minimal working example:

using Lux, NeuralPDE, OptimizationOptimisers, Plots, Random
using LambertW  # assumed source of lambertw (LambertW.jl), needed for the analytic solution below
import ModelingToolkit: Interval

@parameters t
@variables u(..)

Dt = Differential(t)
eq = Dt(u(t)) ~ (u(t))^2 - (u(t))^3

δ = 0.01
bcs = [u(0.0) ~ δ]
domains = [t ∈ Interval(0.0, 2 / δ)]

chain = Lux.Chain(Dense(1, 8, Lux.σ), Dense(8, 8, Lux.σ), Dense(8, 1))

initParams, initState = Lux.setup(Random.default_rng(), chain)
optimiser = OptimizationOptimisers.Adam(0.01)

discretization = PhysicsInformedNN(chain, QuasiRandomTraining(1000))
@named pde_system = PDESystem(eq, bcs, domains, [t], [u(t)])
prob = discretize(pde_system, discretization)


callback = function (params, loss_val)
    ps, st = Lux.setup(Random.default_rng(), chain)
    y, st = Lux.apply(chain, [0.5], params, st)
    println("loss = ", loss_val)

    # stop optimization when loss value is smaller than 1e-7
    return loss_val < 1e-7
end

params_trained = solve(prob, optimiser, callback=callback, maxiters=2000, save_best=true)
state_trained = discretization.phi

analytic_sol_func(t) = 1 / (lambertw((1 / δ - 1) * exp(1 / δ - 1 - t)) + 1)

# plot result
dx = 0.001
xs = [infimum(d.domain):(dx/10):supremum(d.domain) for d in domains][1]
u_real = analytic_sol_func.(xs)
state_with_params(x) = state_trained(x, params_trained.u)
u_predict = first.(state_with_params.(xs))

x_plot = collect(xs)
plot(x_plot, u_real, label="real")
plot!(x_plot, u_predict, label="predict")

And here is an image of the final plot:

[Inserted image]

The blue line in the image is the analytical solution and the red/orange line represents the output of the network. As you can see, the two differ quite a lot.

drsk · February 5, 2024, 10:31am

Hi @Vivi,

if you also compute symbolic_discretize(pde_system, discretization), you can observe the created loss functions in the returned symbolic problem structure.

Vivi · February 6, 2024, 10:40am

Thank you very much @drsk.

I took a look at the output of the symbolic problem structure. As I’m quite new, I don’t really understand how to call this loss function to gain a better understanding, e.g. with the debugger. My idea was to check whether the loss is nearly 0 when I plug in the analytic solution.
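The check described above can also be done outside NeuralPDE. The following is a minimal sketch, not NeuralPDE's internal loss: it evaluates the ODE residual of the analytic solution using a central finite difference. To stay dependency-free it uses a hand-rolled Newton iteration for the Lambert W function; the names lambertw_newton, u_exact and residual are hypothetical and exist only for this illustration.

```julia
# Hand-rolled Lambert W (principal branch, x >= 0) via Newton's method,
# used only to keep this sketch free of package dependencies.
function lambertw_newton(x::Float64; maxiter::Int=100)
    w = log1p(x)                      # starting guess at or above the root for x >= 0
    for _ in 1:maxiter
        ew = exp(w)
        dw = (w * ew - x) / (ew * (w + 1))
        w -= dw
        abs(dw) <= 1e-14 * abs(w) && break
    end
    return w
end

δ = 0.01
a = 1 / δ - 1
u_exact(t) = 1 / (lambertw_newton(a * exp(a - t)) + 1)

# Pointwise residual of Dt(u) ~ u^2 - u^3, with Dt approximated by a central difference.
function residual(u, t; h=1e-4)
    du = (u(t + h) - u(t - h)) / (2h)
    return du - (u(t)^2 - u(t)^3)
end

ts = 0.0:0.5:(2 / δ)
res = [residual(u_exact, t) for t in ts]
mse = sum(abs2, res) / length(res)

println("u(0) - δ            = ", u_exact(0.0) - δ)
println("max |residual|      = ", maximum(abs, res))
println("mean squared resid. = ", mse)
```

If the residual is close to zero here, the equation and the analytic formula agree, and any mismatch in the plot has to come from the trained network rather than from the problem definition.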

For example I looked at the output of symbolic_discretize(pde_system, discretization).loss_functions.pde_loss_functions which is

NeuralPDE.var"#92#95"{NeuralPDE.var"#219#220"{RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#293"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0x4842fd3a, 0xfdea5419, 0x88b3ff40, 0x144fdfca, 0xc206fec8), Expr}, NeuralPDE.var"#12#13", NeuralPDE.var"#279#286"{NeuralPDE.var"#279#280#287"{typeof(NeuralPDE.numeric_derivative)}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, QuasiRandomTraining}, typeof(NeuralPDE.numeric_derivative), NeuralPDE.Phi{Chain{NamedTuple{(:layer_1, :layer_2, :layer_3), Tuple{Dense{true, typeof(sigmoid_fast), typeof(glorot_uniform), typeof(zeros32)}, Dense{true, typeof(sigmoid_fast), typeof(glorot_uniform), typeof(zeros32)}, Dense{true, typeof(identity), typeof(glorot_uniform), typeof(zeros32)}}}, Nothing}, NamedTuple{(:layer_1, :layer_2, :layer_3), Tuple{NamedTuple{(), Tuple{}}, NamedTuple{(), Tuple{}}, NamedTuple{(), Tuple{}}}}}, Nothing}, Tuple{Vector{Float64}, Vector{Float64}}, Int64, Qua…

I assume the first parameters like cord, phi,… are the input arguments to this loss function, but what do they represent?

Or, asked the other way around: since I want to learn this simple ODE, does anyone have tips on why my results are so bad? Is the loss function really the problem, or am I missing something else?

drsk · February 6, 2024, 1:17pm

What does the output of the solver say? You should see the actual loss printed for every iteration. Your solver stops after a maximum of 2000 iterations, hence it could be that you haven’t converged yet. Also note that the loss function depends on the actual training strategy used.

Vivi · February 26, 2024, 10:47am

Hi, sorry for the late response, I didn’t have time for this project. But now I’m back. I do not fully understand what you mean by solver output. I print the loss value in the callback at every iteration, and it is quite small. If you mean something else, I would really appreciate it if you could explain a bit more.

I also ran a training with more iterations, 12 000 in my case, and the loss is 5e-9 (I changed the return in the callback to return false). Do you suggest running even more iterations for convergence? And if so, do you have a suggestion for roughly how many I need?

For the training strategy, I think I use QuasiRandomTraining.

drsk · February 26, 2024, 2:22pm

I meant indeed the output of your loss function during training. Did you try different strategies? The loss function of QuasiRandomTraining is mean(abs2, loss_function(sets__, θ)), where the inner loss function is computed from your equation. But this depends also on how you sample. In your example, the loss seems zero almost everywhere outside [70, 110] and concentrated on a fairly small interval. I’d try a QuadratureTraining and see if you get a better result.
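To make the mean(abs2, ...) form concrete, here is a minimal sketch of such a mean squared residual for this ODE. It is not NeuralPDE's implementation: the uniform grid stands in for the quasi-random samples, the derivative is a central finite difference, and the names residual, residual_loss and flat are hypothetical. It evaluates a deliberately wrong candidate, the constant function u(t) = δ.

```julia
using Statistics: mean

δ = 0.01
ts = range(0.0, 2 / δ; length=1000)     # uniform grid standing in for the sampled points

# Pointwise residual of Dt(u) ~ u^2 - u^3; Dt is approximated by a central difference.
function residual(u, t; h=1e-4)
    du = (u(t + h) - u(t - h)) / (2h)
    return du - (u(t)^2 - u(t)^3)
end

# Mean squared residual over the sample points (the ODE term only).
residual_loss(u) = mean(abs2, [residual(u, t) for t in ts])

flat(t) = δ                              # deliberately wrong candidate: never leaves δ
println("ODE loss of u(t) = δ: ", residual_loss(flat))
```

For the constant candidate the residual is δ^2 - δ^3 at every point, so the loss is (δ^2 - δ^3)^2, about 9.8e-9 for δ = 0.01. That is already below the 1e-7 stopping threshold in the callback, which suggests that a very small loss value alone does not show that the network has found the right solution.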

Vivi · February 29, 2024, 4:21pm

Thank you, I will try different strategies over the next few days.
I already tried QuadratureTraining, but the results are nearly the same as before.
I have the feeling that my boundary condition is not considered in the loss evaluation, and I think that’s the reason for the bad results, since f(t) = 0 would be a solution when ignoring the boundary condition. To test this, I set the weights for the equation and the boundary condition (pde_loss_weights, bc_loss_weights), and even when bc_loss_weights is much higher, the boundary condition does not seem to be considered. I even tried pde_loss_weights = 0. Do you have an idea whether this could be the problem and how to solve it?
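One way to probe this suspicion is to compare two constant candidates by hand. The sketch below assumes the total loss is a weighted sum of a squared ODE residual and a squared boundary mismatch; the names pde_term, bc_term, total, w_pde and w_bc are hypothetical stand-ins, not NeuralPDE's API.

```julia
δ = 0.01

# Closed-form loss terms for constant candidates u(t) = c, where Dt(u) = 0:
pde_term(c) = abs2(0.0 - (c^2 - c^3))   # squared ODE residual, same at every t
bc_term(c) = abs2(c - δ)                # squared mismatch with u(0) ~ δ

# Hypothetical weighted sum; w_pde and w_bc stand in for the loss weights.
total(c; w_pde=1.0, w_bc=1.0) = w_pde * pde_term(c) + w_bc * bc_term(c)

for w_bc in (1.0, 100.0, 1e4)
    println("w_bc = ", w_bc,
            "  u ≡ 0: ", total(0.0; w_bc=w_bc),
            "  u ≡ δ: ", total(δ; w_bc=w_bc))
end
```

Under these assumptions the zero function is penalized through the boundary term, but the constant u(t) = δ satisfies the boundary condition exactly, so its loss stays at about 9.8e-9 no matter how large the boundary weight is. Raising the boundary weight would therefore not be expected to push the network away from a nearly flat solution.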

drsk · March 4, 2024, 9:18am

Looking at your equation more closely, I think the solution will be very sensitive to the initial condition u(0) = δ, and your δ is already tiny. Can you see what happens when you choose a δ with a different magnitude, say δ ≈ 0.5? I don’t think there is a problem in NeuralPDE, but just an extremely high sensitivity of the solutions to the initial condition.
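The dependence on δ can be read off the analytic solution quoted earlier in the thread. As a rough illustration, the sketch below computes the time at which that solution reaches u = 1/2; the helper name t_half is hypothetical, and the formula follows from W(e) = 1.

```julia
# Time at which the analytic solution reaches u = 1/2, derived from
# u(t) = 1 / (W(a * exp(a - t)) + 1) with a = 1/δ - 1:
# u = 1/2  <=>  W(x) = 1  <=>  x = e  <=>  t = a - 1 + log(a)
function t_half(δ)
    a = 1 / δ - 1
    return a - 1 + log(a)
end

for δ in (0.01, 0.011, 0.1, 0.5)
    println("δ = ", δ, "  ->  u reaches 1/2 at t ≈ ", round(t_half(δ); digits=2))
end
```

For δ = 0.01 the transition sits near t ≈ 102.6, after a long, almost flat stretch, while for δ = 0.5 it happens at t = 0. A small relative change in δ shifts the transition by several time units, which is one way to picture the sensitivity mentioned above.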

Vivi · March 8, 2024, 11:30am

Hi, you are right: when I set the δ parameter to 0.5, the result approaches the analytical solution, not exactly, but at least it is close. So I guess it’s not a problem in NeuralPDE but in my system instead. Thanks for your help!
