# NeuralPDE cubature issue

Hi,
I’m taking my first steps with NeuralPDE.jl and I’ve run into an issue. I want to solve a fairly simple PDE:

```julia
h(ω) = exp.(-(γ - 1) * ω)
μω = λ * (ω - ωb)
σω = σ
eq = 1/100 * Dτ(f(τ, ω)) ~ 0.5 * σω^2 * Dωω(f(τ, ω)) - μω * Dω(f(τ, ω)) - k * f(τ, ω) + h(ω)
```
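Spelled out, with the symbols defined as in the snippet, the equation is

$$
\frac{1}{100}\,\partial_\tau f(\tau,\omega) = \frac{1}{2}\,\sigma_\omega^2\,\partial_{\omega\omega} f(\tau,\omega) - \mu_\omega\,\partial_\omega f(\tau,\omega) - k\,f(\tau,\omega) + h(\omega),
$$

with $\mu_\omega = \lambda(\omega - \omega_b)$, $\sigma_\omega = \sigma$, and $h(\omega) = e^{-(\gamma - 1)\omega}$.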


The domain is ω ∈ [-2, 2] and τ ∈ [0, 1]. I managed to get a decent solution (judging by a visual comparison with the finite-difference one; see the figure below for a cross-section at time τ = 1) by using GridTraining:

*(figure omitted)*

On the other hand, if I use QuadratureTraining with, say, CubatureJLh (one of the suggested algorithms), I get a small loss but the solution is nowhere near the real one (same cross-section as before):

*(figure omitted)*

I tried different cubature methods and played around with some parameters with no luck.
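For reference, trying a different cubature method only means changing the `quadrature_alg` keyword of `QuadratureTraining`. A minimal sketch of what that looks like (the algorithm list and tolerances here are illustrative, and the backends are assumed to be available, e.g. via `using Integrals, Cubature`):

```julia
# Illustrative sketch: cycle through a few quadrature backends.
# CubatureJLh/CubatureJLp wrap Cubature.jl; HCubatureJL wraps HCubature.jl.
for alg in (CubatureJLh(), CubatureJLp(), HCubatureJL())
    strategy = QuadratureTraining(; quadrature_alg = alg,
                                  reltol = 1e-6, abstol = 1e-3,
                                  maxiters = 1_000)
    # ...rebuild PhysicsInformedNN(chain, strategy), discretize, and retrain...
end
```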

Am I doing something obviously wrong? Is the problem particularly ill-suited to cubature? Any insight would be appreciated. Here’s some of the code I use to solve the problem, in case it helps:

```julia
using NeuralPDE, Flux, ModelingToolkit
using Optimization, OptimizationOptimJL   # BFGS; Flux.ADAM may also need OptimizationFlux
import ModelingToolkit: Interval, infimum, supremum
# (CubatureJLh/CubatureJLp for QuadratureTraining come from Integrals.jl's Cubature.jl backend)

τ_min = 0.0
τ_max = 1.0
τ_span = (τ_min, τ_max)

ω_min = -2.0
ω_max = 2.0

dω = 0.05
dt_NN = 0.01
dt_BSDE = 0.01

μY = 0.03
σ = 0.03
λ = 0.1
ωb = μY / λ

γ = 1.5
k = 0.1
h(ω) = exp.(-(γ - 1) * ω)

@parameters τ ω
@variables f(..)

Dω = Differential(ω)
Dωω = Differential(ω)^2
Dτ = Differential(τ)

μω = λ * (ω - ωb)
σω = σ

# PDE
eq = 1/100 * Dτ(f(τ, ω)) ~ 0.5 * σω^2 * Dωω(f(τ, ω)) - μω * Dω(f(τ, ω)) - k * f(τ, ω) + h(ω)

# Initial and boundary conditions
bcs = [f(τ_min, ω) ~ h(ω),
       Dω(f(τ, ω_max)) ~ 0,
       Dω(f(τ, ω_min)) ~ 0]

# Space and time domains
domains = [τ ∈ Interval(τ_min, τ_max),
           ω ∈ Interval(ω_min, ω_max)]

τs, ωs = [infimum(d.domain):dω:supremum(d.domain) for d in domains]

@named pde_system = PDESystem(eq, bcs, domains, [τ, ω], [f(τ, ω)])

# NN parameters
dim = 2         # input dimension (τ, ω)
hls = dim + 50  # hidden layer size

# Neural network
chain = Chain(Dense(dim, hls, Flux.σ),
              Dense(hls, hls, Flux.σ),
              Dense(hls, hls, Flux.σ),
              Dense(hls, 1))

strategy = GridTraining([dω, dt_NN])

# strategy = QuadratureTraining(; quadrature_alg = CubatureJLp(),
#                               reltol = 1e-6, abstol = 1e-3,
#                               maxiters = 1_000)

discretization = PhysicsInformedNN(chain, strategy)

prob = discretize(pde_system, discretization)

callback = function (p, l)
    println("Current loss is: $l")
    return false
end

res = Optimization.solve(prob, Flux.ADAM(0.01); callback = callback, maxiters = 500)
prob = remake(prob, u0 = res.u)

res = Optimization.solve(prob, Flux.ADAM(0.001); callback = callback, maxiters = 2000)
prob = remake(prob, u0 = res.u)

bfgs = OptimizationOptimJL.BFGS()
res = Optimization.solve(prob, bfgs; callback = callback, maxiters = 100)
prob = remake(prob, u0 = res.u)

phi = discretization.phi
f_NN = [first(Array(phi([τ, ω], res.u))) for τ in τs for ω in ωs]
f_NN = reshape(f_NN, (length(ωs), length(τs)))
```

---

You’re not doing anything obviously wrong. You’ll see this sometimes; it shows up in [NeuralPDE: Automating Physics-Informed Neural Networks (PINNs) with Error Approximations (arXiv:2107.09443)](https://arxiv.org/abs/2107.09443). Honestly, it’s not entirely clear what the cause is. I think it’s because BFGS is not good for stochastic loss functions. While a convergent integral is technically not a stochastic result, when the result is too far off the integral error is high, which can probably make the loss seem stochastic, and that can cause BFGS to get stuck in a (bad) local minimum. But yeah, if you lower the tolerances, I think it goes away? I’m still trying to fully understand the behavior; quasi-random sampling seems to outperform it in such cases.

---

Mmm, it actually gets worse (here abstol = 1e-8, reltol = 1e-6):

*(figure omitted)*

Now, I have some random thoughts:

1. I specified reflective boundary conditions, while my initial condition does not have zero derivative at the boundary. Could this be a problem for cubature algorithms?
2. It looks like the optimization wants to converge to a discontinuous function.

---

Yes, definitely. If there is a definitional issue like this, these methods will adapt to try and satisfy it. Your initial condition should satisfy the boundary conditions.

---

So, I did some more digging and I still do not understand what’s happening. I’ve been working with a variation of the previous problem. The code that defines the problem is the following:

```julia
using NeuralPDE, Lux, ModelingToolkit
using Optimization, OptimizationOptimisers   # Adam
import ModelingToolkit: Interval, infimum, supremum
# (CubatureJLp comes from Integrals.jl's Cubature.jl backend)

# Parameters
τ_min = 0.0
τ_max = 1.0
τ_span = (τ_min, τ_max)

ω_min = -2.0
ω_max = 2.0

# Grid (if needed)
dω = 0.05
dt_NN = 0.01

μY = 0.03
σ = 0.03
λ = 0.1
ωb = 0
γ = 1.5
k = 0.1

# Initial condition (a centered Gaussian)
h(ω) = exp.(-(γ - 1) * 5 * ω .^ 2)

@parameters τ ω
@variables f(..)

Dω = Differential(ω)
Dωω = Differential(ω)^2
Dτ = Differential(τ)

μω = λ * (ω - ωb)
σω = σ

# PDE
eq = 1/100 * Dτ(f(τ, ω)) ~ 0.5 * σω^2 * Dωω(f(τ, ω)) - μω * Dω(f(τ, ω)) - k * f(τ, ω) + h(ω)

# Initial and boundary conditions
bcs = [f(τ_min, ω) ~ h(ω),
       Dω(f(τ, ω_max)) ~ 0,
       Dω(f(τ, ω_min)) ~ 0]

# Space and time domains
domains = [τ ∈ Interval(τ_min, τ_max),
           ω ∈ Interval(ω_min, ω_max)]

τs, ωs = [infimum(d.domain):dω:supremum(d.domain) for d in domains]

@named pde_system = PDESystem(eq, bcs, domains, [τ, ω], [f(τ, ω)])

# NN parameters
dim = 2
hls = dim + 50
```

As you can see, the PDE is parabolic and not particularly wild: it has Neumann boundary conditions, and the initial condition is a centered Gaussian. If I solve it with the grid strategy, I get what I expect:

Initial condition: *(figure omitted)*

Value at the terminal time: *(figure omitted)*

If I instead use the following code for the quadrature strategy,

```julia
# Neural network
chain = Lux.Chain(Dense(dim, hls, Lux.σ),
                  Dense(hls, hls, Lux.σ),
                  Dense(hls, 1))

strategy = QuadratureTraining(; quadrature_alg = CubatureJLp(),
                              reltol = 1e-5, abstol = 1e-8,
                              maxiters = 1_000)

discretization = PhysicsInformedNN(chain, strategy)
prob = discretize(pde_system, discretization)

callback = function (p, l)
    println("Current loss is: $l")
    return false
end

res = Optimization.solve(prob, Adam(0.01); callback = callback, maxiters = 5000)
prob = remake(prob, u0 = res.u)
```


I get a terrible solution, even though the loss value is of the order of 1e-5:

Initial condition: *(figure omitted)*

Value at the terminal time: *(figure omitted)*

What am I doing wrong?

---

Can you open an issue? We can look at this in more detail.
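---

As an aside, the boundary-compatibility point raised above can be checked directly by evaluating the initial condition’s derivative at the boundaries. A minimal sketch for the two initial conditions used in this thread (the `hprime` helpers are hypothetical, written just for this check):

```julia
# Hypothetical check, not from the original posts: the Neumann/reflective BCs
# require ∂ωf = 0 at ω = ±2 for every τ, including τ = τ_min where f(τ_min, ω) = h(ω).
γ = 1.5

# First problem: h(ω) = exp(-(γ - 1)ω), so h′(ω) = -(γ - 1) * exp(-(γ - 1)ω)
hprime(ω) = -(γ - 1) * exp(-(γ - 1) * ω)
hprime(-2.0), hprime(2.0)   # ≈ (-1.36, -0.18): nonzero, so this IC violates the BCs

# Gaussian variant: h(ω) = exp(-(γ - 1) * 5ω²), so h′(ω) = -10(γ - 1)ω * exp(-(γ - 1) * 5ω²)
hprime_gauss(ω) = -10 * (γ - 1) * ω * exp(-(γ - 1) * 5 * ω^2)
hprime_gauss(-2.0), hprime_gauss(2.0)   # ≈ (4.5e-4, -4.5e-4): essentially compatible
```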