 # Help speeding up finite horizon dynamic programming problem

#1

I am struggling to speed up the computation of solutions for a finite horizon dynamic programming problem. I am using interpolation to avoid computing the value function outside of points in the grid. Right now it takes my computer around 7 minutes to solve this 5-period model.

Please have a look and let me know what I can improve. Any suggestions you may have will help me!

Thanks!

``````# VFI finite horizon
using Optim
using Distributions
using Random
using Interpolations

# transformation functions
# open interval (a, b)
function logit(x; a = 0, b = 1)
(b - a) * (exp(x)/(1 + exp(x))) + a
end

function utility(c::R, L::R) where R <: Real
if c <= 0 || L <= 0
return -Inf
else
return log(c) + log(1 - L)
end
end

const T = 5 # terminal period
const β = 0.95 # discount factor
const r = 0.05 # interest rate
const dist = LogNormal(0, 1) # distribution of ξ
const n = 1 # number of points of support of ξ

Random.seed!(1234)
const w = Vector{Float64}(undef, T) # exogenous wages

w .= (900 .+ 20.0 .* (1:T) .- 0.5 .* (1:T).^2)

const grid_A = -1000:10:8_000

const ξ = quantile.(dist, 0:1/n:prevfloat(1.0))

const Vdict = Array{Float64}(undef, (n, length(grid_A), T))
const Cdict = Array{Float64}(undef, (n, length(grid_A), T))
const Ldict = Array{Float64}(undef, (n, length(grid_A), T))
const A1dict = Array{Float64}(undef, (n, length(grid_A), T))
const convdict = Array{Bool}(undef, (n, length(grid_A), T))

# solve last period
for s in 1:length(grid_A)
for i in 1:n
opt = optimize( x -> -utility(w[T]*x + grid_A[s]*(1+r), x), 0.0, 1.0 )
Ldict[i, s, T] = Optim.minimizer(opt)
Cdict[i, s, T] = (w[T] + ξ[i])*Ldict[i, s, T] + grid_A[s]*(1+r)
A1dict[i, s, T] = 0.0
Vdict[i, s, T] = -Optim.minimum(opt)
end
end

interp_func(t) = LinearInterpolation(grid_A, sum(Vdict[i, :, t] for i in 1:n) ./ n, extrapolation_bc = Line())

# solve rest of periods
@time for t in T-1:-1:1
for s in 1:length(grid_A)
for i in 1:n
initial_x = [A1dict[i, s, t+1], 0.0]
# x is assets to carry forward, x is labor supply
opt = optimize( x -> -(utility(logit(x)*(w[t] + ξ[i])+ grid_A[s]*(1+r) - x, logit(x)) +
β*interp_func(t+1)(x) ),
initial_x, method = NewtonTrustRegion(), autodiff = :forward, iterations = 100_000)
Vdict[i, s, t] = -Optim.minimum(opt)
xstar = Optim.minimizer(opt)
Ldict[i, s, t] = logit(xstar)
A1dict[i, s, t] = xstar
Cdict[i, s, t] = Ldict[i, s, t] * (w[t] + ξ[i]) + grid_A[s] * (1+r) - A1dict[i, s, t]
convdict[i, s, t] = Optim.converged(opt)
end
end
println("period ", t, " finished")
end
``````
#2

The first thing I would suggest is also the very first thing on the list of performance tips: put all of your code into functions rather than relying on global variables. Having your code in functions will also make it easier to read and optimize further.

1 Like
#3

Does it really matter? All those variables are `const`s and the rest are local to `for` loops.

#4

It probably won’t matter here since most of your time is likely inside some library function like `optimize`. But it could. If you don’t get rid of the most simple performance problems that people encounter, it will get pointed out to you here. So to get the most relevant responses you should try to follow https://docs.julialang.org/en/v1/manual/performance-tips/index.html first and then, if possible, do some initial profiling to see if you can give some info about what is slow.

#5

In addition to what @kristoffer.carlsson said , I’d be concerned about https://github.com/JuliaLang/julia/issues/15276 and the closures created in the above code, and it will be much easier to check for that issue (and fix it) when dealing with functions rather than dealing with global state.

1 Like
#6

Ok, here is my naive attempt at addressing your comments. Is this better?

``````using Optim
using Interpolations

struct DP{Ty <: Real}
u::Function
T::Int
β::Ty
r::Ty
n::Int
w::Vector{Ty}
grid_A::Vector{Ty}
ξ::Vector{Ty}
end

function solvelast!(dp::DP{Ty}, Ldict::Array{Ty}, Cdict::Array{Ty}, A1dict::Array{Ty}, Vdict::Array{Ty}) where Ty <: Real
utility = dp.u
grid_A = dp.grid_A
n = dp.n
w = dp.w
ξ = dp.ξ
r = dp.r
T = dp.T
for s in 1:length(grid_A)
for i in 1:n
opt = optimize( x -> -utility(w[T]*x + grid_A[s]*(1+r), x), 0.0, 1.0 )
Ldict[i, s, T] = Optim.minimizer(opt)
Cdict[i, s, T] = (w[T] + ξ[i])*Ldict[i, s, T] + grid_A[s]*(1+r)
A1dict[i, s, T] = 0.0
Vdict[i, s, T] = -Optim.minimum(opt)
end
end
return Ldict, Cdict, A1dict, Vdict
end

# transformation functions
# open interval (a, b)
function logit(x; a = 0, b = 1)
(b - a) * (exp(x)/(1 + exp(x))) + a
end

# interpolation
interp_func(t) = LinearInterpolation(grid_A, sum(Vdict[i, :, t] for i in 1:n) ./ n, extrapolation_bc = Line())

function solverest!(dp::DP{Ty}, Ldict::Array{Ty}, Cdict::Array{Ty}, A1dict::Array{Ty}, Vdict::Array{Ty}, convdict::Array{Bool}; t0::Int=1) where Ty <: Real
utility = dp.u
grid_A = dp.grid_A
n = dp.n
w = dp.w
ξ = dp.ξ
r = dp.r
T = dp.T

for t in T-1:-1:t0
for s in 1:length(grid_A)
for i in 1:n
initial_x = [A1dict[i, s, t+1], 0.0]
# x is assets to carry forward, x is labor supply
opt = optimize( x -> -(utility(logit(x)*(w[t] + ξ[i])+ grid_A[s]*(1+r) - x, logit(x)) +
β*interp_func(t+1)(x) ),
initial_x, method = NewtonTrustRegion(), autodiff = :forward, iterations = 100_000)
Vdict[i, s, t] = -Optim.minimum(opt)
xstar = Optim.minimizer(opt)
Ldict[i, s, t] = logit(xstar)
A1dict[i, s, t] = xstar
Cdict[i, s, t] = Ldict[i, s, t] * (w[t] + ξ[i]) + grid_A[s] * (1+r) - A1dict[i, s, t]
convdict[i, s, t] = Optim.converged(opt)
end
end
println("period ", t, " finished")
end
return Ldict, Cdict, A1dict, Vdict, convdict
end
``````

I then instantiate the problem and solve it:

``````using Distributions

function u(c::R, L::R) where R <: Real
if c <= 0 || L <= 0
return -Inf
else
return log(c) + log(1 - L)
end
end

T = 5 # terminal period
β = 0.95 # discount factor
r = 0.05 # interest rate
dist = LogNormal(0, 1) # distribution of ξ
n = 1 # number of points of support of ξ

w = (900 .+ 20.0 .* (1:T) .- 0.5 .* (1:T).^2)

grid_A = collect(-1000:10.0:8_000)

ξ = quantile.(dist, 0:1/n:prevfloat(1.0))

Vdict = Array{Float64}(undef, (n, length(grid_A), T))
Cdict = Array{Float64}(undef, (n, length(grid_A), T))
Ldict = Array{Float64}(undef, (n, length(grid_A), T))
A1dict = Array{Float64}(undef, (n, length(grid_A), T))
convdict = Array{Bool}(undef, (n, length(grid_A), T))

dp = DP(u, T, β, r, n, w, grid_A, ξ)

solvelast!(dp, Ldict, Cdict, A1dict, Vdict)
solverest!(dp, Ldict, Cdict, A1dict, Vdict, convdict)
``````
#7

Definitely an improvement Have you tried benchmarking any of the parts of your process independently? Or have you tried profiling with `Profile` and https://github.com/timholy/ProfileView.jl ? And have you ensured that everything is type-stable by running `@code_warntype` ? For example, your `DP` struct has a field with type `::Function`, but `Function` is an abstract type, so accessing that field inside the loop will likely perform suboptimally. You can instead make your type something like :

``````struct DP{Ty <: Real, F <: Function}
u::F
...
``````

Running `@code_warntype` should show the instability caused by that abstract type annotation, and it should also confirm that the `F <: Function` change fixes it.

1 Like
#8

I find that `@code_warntype` is not very informative here:

``````julia> @code_warntype solverest!(dp, Ldict, Cdict, A1dict, Vdict, convdict)
Body::Tuple{Array{Float64,3},Array{Float64,3},Array{Float64,3},Array{Float64,3},Array{Bool,3}}
1 ─ %1 = invoke Main.:(#solverest!#14)(1::Int64, _1::Function, _2::DP{Float64,typeof(u)}, _3::Array{Float64,3}, _4::Array{Float64,3}, _5::Array{Float64,3}, _6::Array{Float64,3}, _7::Array{Bool,3})::Tuple{Array{Float64,3},Array{Float64,3},Array{Float64,3},Array{Float64,3},Array{Bool,3}}
└──      return %1
``````

I will try `ProfileView`, but I’ve had a hard time with that tool in the past.

#9

There are several abstract types here that would kill performance. This is a first-order concern.

I find it much easier to work with named tuples as collecting sets of parameters It is less verbose and there is no chance of messing up abstract parameters. It is also easy create default values. In practice there is also no need annotate types for functions. For example:

``````function myalgorithm(params)
# use params
end
params = (T = 5, dist = LogNormal(0, 1)) # etc.
myalgorithm(params)

``````

That pattern will prevent using incorrect type annotations (because you never annotate).

It won’t specifically help performance, but you can also use the `Parameters` package:

``````using Parameters
function myalgorithm(params)
@unpack T, dist = params # extracts values from named tuples to local binding
end

default_params = @with_wk (T = 5,  LogNormal(0, 1)) # generates with defaults
params = default_params() # all defaults
params_2 = default_params(T = 10) # changes one, etc
myalgorithm(params) # etc.
``````
1 Like
#10

Thanks for the suggestion @jlperla . So instead of defining a `struct` I should simply use `NamedTuples`.

I have a question, when using plain `NamedTuples` how do you extract the values from `params`, like below?

``````function f(params)
fun = params.fun
x = params.x
return fun(x)
end

params = (fun = sin, x = pi)
f(params)
``````

Thinking ahead, would a framework using `NamedTuples` be harder to debug? I think if an error arises, it would be hard to track, no?

#11

I am not sure I understand — the only abstract type I see is `Function`. Fix that, or introduce a function barrier, and it should be fine.

`NamedTuple`s are fine for prototyping, but starting with `struct` is OK to. Introducing a type parameter for the utility function `u` should fix everything above very easily.

1 Like
#12

Constructing Linear interpolation objects is costly, and you’re doing it in every objective function call. Try constructing `interp_func(t+1)` once outside the objective and then call it within.

2 Likes
#13

Also (should have caught this first): `interp_func` is referring to your value function in global scope. You need to change it so that you pass it various matrices it needs as arguments, otherwise everything will be slow.

#14

I’m not sure I follow. I construct the interpolation outside the loops over `i` and `s`. I need to construct a new interpolation for each `t`, I thought the function `interp_func(t)` did that.

#15

Ok, that is easily fixable.

#16

Here, you’re calling `interp_func`, which does a lot of work to precalculate the slopes (I think) inside the hot loop of your optimization problem.

2 Likes
#17

I think I understand what you mean. So, instead of constructing the function outside the main loop, I can construct it inside the loop for `t`, but outside the other 2 loops. So I would end up with something like this:

``````for t in T-1:-1:1
interp_func = ... # create interpolation
for s in 1:length(grid_A)
for i in 1:n
opt = ... #<optimization calling interp_func(x)>
.....
``````

That way the `interp_func` is constructed once for each `t`. Is that what you mean?

#18

Yes exactly. But this isn’t the big problem here.

After looking at this for a bit, I think the bigger problem (and what it looks like is taking all of the time in “solving” this) is that you’re not solving it. The problem is not defined at the low end of your asset grid, since there are grid points for which there is no labor supply that ensures positive consumption. It looks like it’s taking forever because Optim is trying to be really sure that it has searched before it returns “I guess you get `-Inf` utils.” But the problem is just undefined at those points…

I bet the problem will get solved very quickly if you restrict the asset grid to just positive values. The real trick will be allowing for negative assets up to the worker’s natural borrowing limit.

#19

I think I let the full model (with `T = 65`) run for a couple of hours and the solutions involved some borrowing in early life. I will have to check that more carefully.

The good news is that with constructing the interpolating function inside the loop for `t`, I can now solve it within 50 seconds, which is a huge improvement! Thanks!

I was starting to think of constructing my own optimizer, since I read elsewhere that `Optim` is not completely type-stable, or using another parameterization to ensure the labor supply remains between 0 and 1. I thought that the logit function had tails that were too flat.

#20

What are your grid sizes? This really shouldn’t take hours? Also this is really parallelisable, if you check my github there’s an implementation of a parallel finite horizon vfi thing from my thesis (Learning Models)

2 Likes