Solve a routine economics PDE from an HJB w/ ModelingToolkit

Consider a routine continuous time optimization problem from economics:
V(t,a_{t}) := \max \int_{\tau=t}^{\tau = T} e^{-\rho (\tau -t)} u(c_{\tau})d\tau \text{ s.t. }
\dot{a}_{t} = y + ra_{t} - c_{t},
a_{0} \text{ given, } a_{T}=0.
Assume y & r are constants and u(c)=\frac{c^{1-\gamma}}{1-\gamma}.

The problem has a known closed-form solution (matching VS in the code below):
V(t,a_{t}) = \frac{\left(a_{t} + \frac{y}{r}\right)^{1-\gamma}}{1-\gamma}\left(\frac{1 - e^{-\omega (T-t)}}{\omega}\right)^{\gamma}, \qquad \omega := r - \frac{r-\rho}{\gamma}.
Here I plot V(0,a_{t}) and V(t,1); the plot images are omitted here, but the code below reproduces them.

To solve this using the HJB approach requires solving a PDE:
\left[\begin{array}{l} \rho V(t,a_{t}) = \frac{c_{t}^{1-\gamma}}{1-\gamma} + V_{a}(t,a_{t})\times \left(y + ra_{t} - c_{t} \right) + V_{t}(t,a_{t}) \\ c(t,a_{t}) = (V_{a}(t,a_{t}))^{-\frac{1}{\gamma}} \\ V(T,a_{T}) = 0 \end{array} \right]
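
The consumption rule in the second line is the first-order condition of the maximization inside the HJB; spelling out this standard step for completeness:
\rho V = \max_{c}\left\{ u(c) + V_{a}\,(y + ra_{t} - c)\right\} + V_{t} \;\Rightarrow\; u'(c) = V_{a} \;\Rightarrow\; c = (V_{a})^{-\frac{1}{\gamma}},
since u'(c) = c^{-\gamma} for the CRRA utility above.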

I'm trying to solve it w/ ModelingToolkit.jl, but it's taking forever & not converging.

using NeuralPDE, Flux, ModelingToolkit, GalacticOptim, Optim, DiffEqFlux
using Plots

@parameters t a θ
@variables V(..)
@derivatives Da'~a
@derivatives Dt'~t

# Utility and parameters
u(c,γ) = (c^(1.0 - γ))/(1.0 - γ);
γ = 2.00; ρ = 0.01; r = 0.01; y = 0.00; T = 5.0;
ω = r - (r - ρ)/γ;
# Closed-form solution.
VS(t,a; y=y, r=r, γ=γ, ω=ω, T=T) =
    u(a + (y/r), γ) * ((1.0 - exp(-ω*(T - t)))/ω)^γ;
# Closed-form consumption & asset paths in the special case r = ρ (with y = 0).
CS(t,a; y=y, r=r, γ=γ, ω=ω, T=T, a0=1) = (r*a0)/(1.0 - exp(-r*T))
AS(t,a; y=y, r=r, γ=γ, ω=ω, T=T, a0=1) =
    (a0)/(1.0 - exp(-r*T)) +
    a0*(1.0 - 1.0/(1.0 - exp(-r*T)))*exp(r*t);

plot(0.01:.01:5.0, VS.(0.0, 0.01:.01:5.0))   # V(0,a)
plot(0.0:.01:5.0,  VS.(0.0:.01:5.0, 1.0))    # V(t,1)

# Symbolic terms and the first-order condition
v  = V(t,a,θ);
va = max(eps(), Da(V(t,a,θ)));       # clamp V_a away from zero (prior: V_a > 0)
vt = Dt(V(t,a,θ));
c  = (va)^(-1.0/γ);                  # optimal consumption from u'(c) = V_a
#uc = u(c,γ)
uc = (va^(1.0 - 1.0/γ))/(1.0 - γ);   # u(c) with c = va^(-1/γ) substituted in
# HJB equation
eq  = ρ*v ~ uc + va*(y + r*a - c) + vt;

# Boundary conditions
bcs = [V(T,a,θ) ~ 0.0]   # terminal condition V(T,a) = 0
# Space and time domains
domains = [t ∈ IntervalDomain(0.0,5.0),
           a ∈ IntervalDomain(0.0,5.0)]
# Discretization
dx = 0.1

# Neural network
dim = 2 # number of dimensions
chain = FastChain(FastDense(dim,16,Flux.σ),FastDense(16,16,Flux.σ),FastDense(16,1))
discretization = PhysicsInformedNN(dx,chain)

pde_system = PDESystem(eq,bcs,domains,[t,a],[V])
prob = discretize(pde_system,discretization)

cb = function (p,l)
    println("Current loss is: $l")
    return false
end

res = GalacticOptim.solve(prob, Flux.ADAM(); cb = cb, maxiters=10)
res = GalacticOptim.solve(prob, Optim.BFGS(); cb = cb, maxiters=10)

#
phi = discretization.phi
ts,as = [domain.domain.lower:dx/10:domain.domain.upper for domain in domains]
u_predict = reshape(
    [first(phi([t,a],res.minimizer)) for t in ts for a in as],
    (length(as),length(ts))
    )  # the comprehension varies a fastest, so rows index a and columns index t

# plot V(t=0, a)
plot(as, u_predict[:,1])
# plot V(t, a=0)
plot(ts, u_predict[1,:])
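
As a quick sanity check (my addition, not in the original post), the network output can be overlaid on the closed form VS from above:

plot(as, u_predict[:,1], label = "PINN")                        # V(0,a)
plot!(as[2:end], VS.(0.0, as[2:end]), label = "closed form")    # skip a = 0, where V → -Inf when y = 0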
Two problems:

1. It takes forever & is not accurate.
2. Originally I plugged in the exact equation but got an error:

ERROR: DomainError with -0.05701087375174788:
Exponentiation yielding a complex result requires a complex argument.
Replace x^y with (x+0im)^y, Complex(x)^y, or similar.

The issue is that the PDE has terms like V_{a}(t,a_{t})^{-\frac{1}{\gamma}}, so we have numerical problems related to dividing by zero or taking the root of negative numbers. We would like to use the prior knowledge that V_{a}>0.
Following a suggestion from @matthieu, I tried max(eps(), va)^(-1/γ). No more domain error, but it takes forever & is not accurate.

Note: if \gamma > 1 the trivial solution V(t,a_{t})=0 also satisfies the PDE.
It can be avoided w/ a bequest motive…
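
For instance (spelling out the remark above, one standard way to do it): with a bequest motive the terminal condition becomes V(T,a_{T}) = B(a_{T}) for some increasing bequest utility B, which V(t,a_{t}) \equiv 0 no longer satisfies.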

Don't you need a boundary condition?

@rveltz yes.
It's V(T,a_{T})=0 for the unknown function V(t,a_{t}).
In the code: bcs = [V(T,a,θ) ~ 0.0].

That said, most successful attempts to solve HJBs use viscosity solutions…

Does the numerical algorithm implement the ideas behind viscosity solutions?


You'll have to ask the people who created the package…
I know the code uses a new generation of methods that solve PDEs w/ neural networks.
Those methods have been shown to be very successful at solving high-dimensional HJB equations.
I also know that the package is still in its early phase.

To clarify, I'm trained as an economist (my undergrad was in math but no PDEs).
That means I know how to set up a problem that boils down to a PDE, but I count on experts to make solvers where I can plug in my PDE (do a little tuning) and voilà.

I've used Mathematica & a few others. I wanna switch to Julia.

I am not quite sure about this, but in any case for the above problem a spectral method (e.g. Chebyshev polynomials) should be very accurate and super-fast (either limit the domain to a relevant interval [recommended], or transform [-1,1] to \mathbb{R}).

A change of variables can improve on this even further.
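
For concreteness, here is a minimal sketch of that idea (mine, not from the thread): approximate V(t,a) by a tensored Chebyshev series on a truncated domain, compute V_t and V_a with ForwardDiff, and minimize squared HJB residuals at the collocation nodes with Optim. All names (n, Vhat, resid), the 1e-6 clamp, and the boundary weight are assumptions of the sketch.

using LinearAlgebra, ForwardDiff, Optim

γ, ρ, r, y, T = 2.0, 0.01, 0.01, 0.0, 5.0
u(c,γ) = c^(1 - γ)/(1 - γ)

n = 8                                      # basis size per dimension
lo_a, hi_a = 0.01, 5.0                     # truncate a away from 0, where V_a explodes
scale(x, lo, hi) = 2*(x - lo)/(hi - lo) - 1                    # map [lo,hi] -> [-1,1]
cheb(x, n) = [cos(k*acos(clamp(x, -1.0, 1.0))) for k in 0:n-1] # T_0,…,T_{n-1}

# V(t,a) ≈ Σ_{j,k} θ[j,k] T_j(scaled t) T_k(scaled a)
Vhat(θ, t, a) = dot(cheb(scale(t, 0.0, T), n),
                    reshape(θ, n, n) * cheb(scale(a, lo_a, hi_a), n))

# interior collocation nodes: Chebyshev points mapped into the domain
pts(lo, hi) = [lo + (hi - lo)*(cos(π*(i - 0.5)/n) + 1)/2 for i in 1:n]
ts_c, as_c = pts(0.0, T), pts(lo_a, hi_a)

function resid(θ)
    s = zero(eltype(θ))
    for tc in ts_c, ac in as_c
        va = ForwardDiff.derivative(a -> Vhat(θ, tc, a), ac)
        vt = ForwardDiff.derivative(t -> Vhat(θ, t, ac), tc)
        va = max(va, 1e-6)                 # impose the prior V_a > 0
        c  = va^(-1/γ)
        s += (ρ*Vhat(θ, tc, ac) - (u(c,γ) + va*(y + r*ac - c) + vt))^2
    end
    s += 10*sum(abs2, Vhat(θ, T, ac) for ac in as_c)  # terminal condition V(T,a) = 0
    return s
end

res = Optim.optimize(resid, zeros(n*n), BFGS(); autodiff = :forward)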

I think I just mentioned this somewhere else too. If you know the solution is positive, then force it.

FastChain(FastDense(dim,16,Flux.σ),FastDense(16,16,Flux.σ),FastDense(16,1),(x,p)->x.^2)

That's a neural network that is a universal approximator for positive functions.


Sorry I wasn't clear.
The solution to the PDE V(t,a_{t}) is not positive in general (see plots in my original post above).

I know that c>0, and since V_{a}(t,a_{t})=c^{-\gamma} and \gamma>0, we will have V_{a}(t,a_{t})>0.

I don't know how to incorporate this information (V_{a}(t,a_{t})>0).

So you're looking for a monotonic neural network?

Yes, in this case, I know that the unknown function V(t,a_{t}) is monotonically increasing in both its arguments. (Is that what you're asking?)

The PDE that needs to be solved:
\left[\begin{array}{l} \rho V(t,a_{t}) = \frac{ (V_{a} )^{\frac{\gamma-1}{\gamma}}}{1-\gamma} + V_{a}(t,a_{t})\times \left(y + ra_{t} - (V_{a})^{-\frac{1}{\gamma}} \right) + V_{t}(t,a_{t}) \\ V(T,a_{T}) = 0 \end{array} \right]

In my example \gamma = 2:
\left[\begin{array}{l} \rho V(t,a_{t}) = -(V_{a} )^{\frac{1}{2}} + V_{a}(t,a_{t})\times \left(y + ra_{t} - (V_{a})^{-\frac{1}{2}} \right) + V_{t}(t,a_{t}) \\ V(T,a_{T}) = 0 \end{array} \right]
The problem is the term (V_{a})^{-\frac{1}{2}}.
I just want it to know that V_{a} > 0 so it doesn't worry about dividing by zero or taking roots of negative numbers…
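
If you want something smoother than the hard max(eps(), ·) clamp, one option (my suggestion, not from the thread) is a softplus, which is strictly positive and differentiable; assuming log and exp trace through ModelingToolkit like the other operations above:

va = log(1.0 + exp(Da(V(t,a,θ))))   # softplus(V_a): strictly positive, smooth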

Yes. So you'd be looking for a monotonic neural network. I don't exactly know of an architecture that gives that, but that's what you're looking for.


It doesn't really work to discretize this way in general, though it can sometimes in specific circumstances. The issue is that control problems have a very peculiar backwards/forwards structure, which leads to specialized discretization schemes where some parts of the equation are discretized forwards and others backwards. Then things converge to the viscosity solution. Not to say that other approaches can't work, but the convergence of these things is very complicated… You can't just throw equations at an optimizer.
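
To make the forwards/backwards idea concrete, here is a minimal sketch (mine, in the spirit of the upwind schemes in the HACT codes mentioned below; all names are hypothetical) of how the one-sided difference is chosen at a grid point:

# upwind choice of V_a at grid point i, for the drift y + r*a - c
function upwind_dV(V, agrid, i, da; y = 0.0, r = 0.01, γ = 2.0)
    dVf = i < length(V) ? (V[i+1] - V[i])/da : 1e-10   # forward difference
    dVb = i > 1         ? (V[i] - V[i-1])/da : 1e-10   # backward difference
    cf  = max(dVf, 1e-10)^(-1/γ)     # consumption implied by the forward slope
    cb  = max(dVb, 1e-10)^(-1/γ)     # consumption implied by the backward slope
    sf  = y + r*agrid[i] - cf        # savings drift under the forward choice
    sb  = y + r*agrid[i] - cb        # savings drift under the backward choice
    # forward difference where the state drifts up, backward where it drifts
    # down; otherwise treat the point as a steady state (zero drift: c = y + r*a)
    sf > 0 ? dVf : (sb < 0 ? dVb : (y + r*agrid[i])^(-γ))
end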

With spectral-style methods (or just using a neural network with a function) rather than finite difference methods you probably can get away with just minimizing residuals with all of your equations with some neural network approximation, but I can say from experience that they are very finicky for HJBE problems with a control. Boundary conditions are tricky… If you want to do transition dynamics without a control choice, in which case it is kind of like an HJBE without much of an "H", then you can use simpler discretizations.

For something along the lines of what you are describing I think it is a little premature in terms of the theory. For one thing, you are describing a finite time-horizon setup, which is of limited usefulness to most economists, but mostly it is that there is a vast literature on this due to the peculiarities of these equations. Also, I think it very unlikely that neural networks will beat the standard algorithms (e.g. codes - HACT - Benjamin Moll - University of Princeton) for this dimensionality.


Not sure if this helps or confuses. I have some code for monotonic regression. Might have a bug or something, and it's doubtful it's differentiable. Pretty sure it's an active set optimization, but it's been a while since I've looked…
https://github.com/caseykneale/ChemometricsTools.jl/blob/cf53867beb273bffc6deb22d8d060a09c68278ae/src/RegressionModels.jl#L273

My experience is the opposite: they are fast, especially if you get a good initial guess (from the previous set of parameters if you are estimating, or some trick like consistent future approximations), and very accurate. They are also simple to implement (basically working through Miranda-Fackler makes you a black-belt HJB ninja).

In contrast, treating an HJBE like a PDE and discretizing is something that I never manage to implement reliably. The problem is not that I cannot get it working for one set of parameters, but when I am estimating a model it has to work for all sorts of wacky parameters all the time.


Certainly they are amazing in discrete time, and for stationary problems in continuous time (with simple enough boundary conditions). My issues might have all come down to boundary conditions.

But if you have non-stationary problems (i.e. the terminal T in the question he asked) then I am not sure if Miranda-Fackler has examples for Bellman equations/stochastic control? Do you go spectral in time? I am not sure what they would suggest for PDEs rather than the stationary ODEs.

But @Albert_Zevelev, he is 100% right that you should start with Miranda and Fackler for this. In some ways, I think it is best to think of the neural network stuff as more similar to the collocation approaches than finite differences… but there may be different ways to set it up.

But don't forget that one of the requirements for these sorts of spectral or collocation based approaches is taking the derivative of the function you are approximating with. With a polynomial basis this is easy (especially using CompEcon) but with neural networks you would need to get the derivative. No problem, but then you also need to differentiate that to get the gradients to minimize the loss. So you need to do reverse over forward diff or something. Not impossible, but this would take tinkering and the convergence properties will end up being original research, were I to guess.
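
As a toy illustration of the nesting (my example; the reverse-over-forward version would swap the outer call for a reverse-mode tool such as Zygote):

using ForwardDiff
f(x)  = sin(x^2)                       # stand-in for the approximating function
fa(x) = ForwardDiff.derivative(f, x)   # the derivative that enters the residual
ForwardDiff.derivative(fa, 1.0)        # differentiating through that derivative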


Personally I don't have a lot of experience with these models, but yes, I have seen people do this.

But of course, that's what always happens. It will consume 97% of the time spent on the project and end up in the appendix in a very shortened form :wink:


Do you have an example of code where this is done?

About this I am not sure. New and rigorous convergence proofs have a more prominent place in the numerical analysis literature. (Whether in this case such a proof would indeed be new, I cannot judge. It is not my field.) If you can demonstrate convergence and apply it to an interesting class of problems, it is even better, of course.


That may be, but unfortunately general equilibrium problems in economics are quite badly behaved, so most methods are ad hoc ("we did that, and it worked"). Also, from reading what people use in physics, the applications are about two decades behind.


I have seen it at a conference a couple of years ago and tried to find it, but couldn't, sorry. But it would not help with Julia code either, I guess (this was years ago).

But I see no reason why approximating V(t, a) with Chebyshev polynomials (tensored, or Smolyak if you want to be really fancy) shouldn't just work. I would just deal with the possible sign problem with \max(V_a, 0), or a smooth maximum with a small scale, to get an initial solution near the correct one.
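
A numerically stable version of that smooth maximum could look like this (my sketch; ε is a hypothetical scale parameter):

smoothmax(x; ε = 1e-2) = max(x, 0) + ε*log1p(exp(-abs(x)/ε))   # → max(x, 0) as ε → 0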