I have a complex non-linear two-point boundary value problem to solve. It’s very sensitive to how close the initial guess is to the final converged answer. I start by solving in a region where convergence is easier, then use continuation to gradually move to the parameter space I want. However, no matter how small a step I use the problem eventually won’t converge at all. In my investigation I found odd behavior concerning how the initial guess is used in the boundary conditions.
Here is a simplified MWE to show what I mean. It’s a simple thermal problem of a bar of material extending from t=0 to t=1, with uniform internal heat generation. At t=0 the temperature is fixed at 100 degrees C and the axial heat flow is zero. At t=1 the heat flow is convective with an unknown coefficient of convection and an environmental temperature of 20 degrees C. I use a FIRK solver (the same as in my actual, nonlinear problem) to solve for the temperature distribution and the convective coefficient. Here is the code:
using BoundaryValueDiffEq, NonlinearSolve, ForwardDiff
tspan = (0.0, 1.0)
function f!(du, u, p, t)
cond = 0.002
vol_heat = 0.2
du[1] = -u[2]/cond
du[2] = vol_heat
du[3] = 0.0
end
function bca!(res_a, u_a, p)
res_a[1] = u_a[2]
res_a[2] = u_a[1] - 100.0
if p[1]>0.0
println("t = 0.0 ", " u_a[1] = ", ForwardDiff.value(u_a[1]), " u_a[2] = ", ForwardDiff.value(u_a[2]), " u_a[3] = ", ForwardDiff.value(u_a[3]))
println("")
end
end
function bcb!(res_b, u_b, p)
tref = 20.0
res_b[1] = u_b[3] * (u_b[1] - tref) - u_b[2]
if p[1]>0.0
println("t = 1.0 ", " u_b[1] = ", ForwardDiff.value(u_b[1]), " u_b[2] = ", ForwardDiff.value(u_b[2]), " u_b[3] = ", ForwardDiff.value(u_b[3]))
println("")
end
end
u_guess = [ [100.0, 0.0, 0.006666666666666668],
[99.5, 0.020000000000000004, 0.006666666666666668],
[98.0, 0.04000000000000001, 0.006666666666666668],
[95.5, 0.060000000000000005, 0.006666666666666668],
[92.0, 0.08000000000000002, 0.006666666666666668],
[87.5, 0.1, 0.006666666666666668],
[82.0, 0.12000000000000001, 0.006666666666666668],
[75.5, 0.14, 0.006666666666666668],
[68.0, 0.16000000000000003, 0.006666666666666668],
[59.49999999999999, 0.18000000000000002, 0.006666666666666668],
[50.0, 0.2, 0.006666666666666668] ]
bvp1 = TwoPointBVProblem(f!, (bca!, bcb!), u_guess, tspan, [-1.0]; bcresid_prototype = (zeros(2), zeros(1)))
sol1 = solve(bvp1, RadauIIa5(nlsolve = NewtonRaphson()), dt = 0.1, adaptive = false; nlsolve_kwargs = (; maxiters = 10, show_trace = Val(true)))
bvp2 = TwoPointBVProblem(f!, (bca!, bcb!), sol1, tspan, [1.0]; bcresid_prototype = (zeros(2), zeros(1)))
sol2 = solve(bvp2, RadauIIa5(nlsolve = NewtonRaphson()), dt = 0.1, adaptive = false; nlsolve_kwargs = (; maxiters = 10, show_trace = Val(false)))
In the first call to the RaduaIIa5 solver I use a vector of vectors as the initial guess. These values are from a converged case. In the second call to the solver I use a solution (sol1) object as the initial guess. I print the value part of the dual numbers that represent the three state variables (temperature, heat flux, and convective coefficient) for every time the boundary condition functions are called. Here is the output:
Algorithm: NewtonRaphson(
descent = NewtonDescent(),
autodiff = AutoForwardDiff(),
vjp_autodiff = AutoFiniteDiff(
fdtype = Val{:forward}(),
fdjtype = Val{:forward}(),
fdhtype = Val{:hcentral}(),
dir = true
),
jvp_autodiff = AutoForwardDiff(),
concrete_jac = Val{false}()
)
---- ------------- -----------
Iter f(u) 2-norm Step 2-norm
---- ------------- -----------
0 7.07815150e+02 0.00000000e+00
1 2.49915494e-13 1.08699193e+03
Final 2.49915494e-13
----------------------
t = 0.0 u_a[1] = 100.0 u_a[2] = -7.966160064343519e-17 u_a[3] = 0.0066666666666666706
t = 1.0 u_b[1] = 100.0 u_b[2] = -7.966160064343519e-17 u_b[3] = 0.0066666666666666706
t = 0.0 u_a[1] = 100.0 u_a[2] = -7.966160064343519e-17 u_a[3] = 0.0066666666666666706
t = 1.0 u_b[1] = 100.0 u_b[2] = -7.966160064343519e-17 u_b[3] = 0.0066666666666666706
t = 0.0 u_a[1] = 100.0 u_a[2] = -7.966160064343519e-17 u_a[3] = 0.0066666666666666706
t = 1.0 u_b[1] = 100.0 u_b[2] = -7.966160064343519e-17 u_b[3] = 0.0066666666666666706
t = 0.0 u_a[1] = 100.0 u_a[2] = -7.966160064343519e-17 u_a[3] = 0.0066666666666666706
t = 1.0 u_b[1] = 100.0 u_b[2] = -7.966160064343519e-17 u_b[3] = 0.0066666666666666706
t = 0.0 u_a[1] = 100.0 u_a[2] = -7.966160064343519e-17 u_a[3] = 0.0066666666666666706
t = 1.0 u_b[1] = 100.0 u_b[2] = -7.966160064343519e-17 u_b[3] = 0.0066666666666666706
t = 0.0 u_a[1] = 100.0 u_a[2] = -1.0967901537864265e-16 u_a[3] = 0.006666666666666661
t = 1.0 u_b[1] = 50.000000000000064 u_b[2] = 0.19999999999999982 u_b[3] = 0.006666666666666661
t = 0.0 u_a[1] = 100.0 u_a[2] = -1.0967901537864265e-16 u_a[3] = 0.006666666666666661
t = 1.0 u_b[1] = 50.000000000000064 u_b[2] = 0.19999999999999982 u_b[3] = 0.006666666666666661
t = 0.0 u_a[1] = 100.0 u_a[2] = -1.0967901537864265e-16 u_a[3] = 0.006666666666666661
t = 1.0 u_b[1] = 50.000000000000064 u_b[2] = 0.19999999999999982 u_b[3] = 0.006666666666666661
The output, which is the same for either a vector initial guess or a solution object initial guess, shows my issue. The boundary condition at t=0 (function bca!) is first called, and it uses the proper values of the state variables from the initial guess. Then bcb!, at t=1, is called and it uses the state variables’ guessed values at t=0, that is, the values for the wrong boundary condition. In fact, it’s only on the 6th call to bcb! that the proper values for t=1 appear.
For this simple linear MWE there isn’t a problem: bcb! has no issue with these incorrect values, and the iteration easily converges to the correct ones. For my real problem however, this type of behavior can cause either an exception or return of NaN values for residuals from “bcb!”, no matter how close my initial guess is to the final answer.
I have trouble believing that this is the designed behavior. Would someone at SciML please take a look at this issue? My code development is stalled until this is resolved. Thanks.