Status of integration, optimization, and nl solvers

I’m working on a project in which, given a vector of parameters θ, I have to solve a nonlinear system of equations that involve some integrals. After solving the system, I need to optimize with respect to θ to obtain optimal parameter values that fit some data. That means that I have to solve that system of equations many times.

I have not used any packages for integration, but I have used Optim.jl and NLsolve.jl in the past. However it’s been a while since I last worked with those packages and I’m wondering if they are still the best in Julia. Also, I’m looking for suggestions of what other packages I should look at. I know that there is GalacticOptim.jl and Quadrature.jl that wrap many other optimization and integration packages, any others? And what about nl solvers? What packages should I look at that play nice with Optim or GalacticOptim, Quadrature and NLsolve?

Depending on your integrals (dimension, smoothness, etc), consider approximating with a Gaussian quadrature, eg as in

and from then on you effectively have a sum of function values with (potentially) parameter-dependent weights, in any case it should be easy to handle with any nonlinear solver.

If this is about structural estimation, I recently asked a similar question

and got some good suggestions, which I am working through. Currently I find NLopt (Augmented Lagrangian for the constraint) very robust, but that’s on a toy problem.

1 Like

Quadrature.jl is compatible with automatic differentiation, so it should play nice with the whole ML ecosystem, etc. It just composes. See for example this example of optimization under uncertainty, mixing autodiff, Quadrature.jl, DifferentialEquations.jl, event handling, multithreading, and a bit more:

http://tutorials.juliadiffeq.org/html/DiffEqUncertainty/02-AD_and_optimization.html

For a bunch of examples on ML integration, see:

https://diffeqflux.sciml.ai/dev/

1 Like

The integrals are fairly simple: \int_0^T K e^{-\rho s} ds and \int_T^\infty H(s, x) e^{-\rho s} ds. I can pick the functional form of H, but probably it will be of the form Z(s)\phi(x). I suppose these are simple enough for FastGaussianQuadrature.jl?

It is about structural estimation, but not as general as your problem. It’s not a GE model, but rather a duration model with behavioral interpretation.

That seems to call a library outside of Julia. I suppose that automatic differentiation does not work here. Also, last time I worked with Optim, it was not easily parallelizable, what about NLopt?

I’ll take a look at that, thank you.

I see that FastGaussianQuadrature integrates over a closed interval, I guess Quadrature it is then.

Yes, with (shifted) Laguerre polynomials.

You can use AD to supply the derivative, it works fine.

A lot of local algorithms are sequential and not inherently parallelizable, but you can make your objective parallelizable just fine.

1 Like

I’d like to play with that. I think it could be fastest way to integrate my functions. Can you provide a reference? Wikipedia doesn’t go in depth for the use of Laguerre polynomials in integrals.

EDIT: Found it in Judd, but if you have another reference I’d appreciate it.

If you have the Judd book I think it is the best reference, really. Miranda and Fackler has a short subsection but few details.

1 Like

GitHub - QuantEcon/Expectations.jl: Expectation operators for Distributions.jl objects is largely intended to wrap the (non-adaptive) Gaussian quadrature routines in FastGaussQuadrature.

Depending on your use case, it may be the most convenient (and all of the change-of-variables required to transform the spaces is implemented).

2 Likes

Do you use NLsolve directly or through GalacticOptim?

maybe not ready for prime time, but i have been using GitHub - JuliaNLSolvers/NLSolvers.jl: No bells and whistles foundation of Optim.jl as a replacement for both Optim (unconstrained) and NLSolve

2 Likes

It looks like your solving an econ continuous time problem.
I’ve thought about doing this in julia quite a bit.

Can you please write out your problem?

1 Like

It’s a Nash bargaining problem of time of retirement for couples (Honoré & de Paula 2018). The optimization problem is

\max _{t_{1}, t_{2}}\left(\int_{0}^{t_{1}} K_{1} e^{-\rho s} d s+\int_{t_{1}}^{\infty} H_{1}\left(s, {x}_{1}\right) D\left(s, t_{2}\right) e^{-\rho s} d s-A_{1}\right) \\ \quad \times\left(\int_{0}^{t_{2}} K_{2} e^{-\rho s} d s+\int_{t_{2}}^{\infty} H_{2}\left(s, {x}_{2}\right) D\left(s, t_{1}\right) e^{-\rho s} d s-A_{2}\right)

where D\left(s, t_{j}\right)=(\delta-1) \mathbb{1}\left(s \geq t_{j}\right)+1. The objective function simplifies to

\begin{aligned} N\left(t_{1}, t_{2}\right)=& \overbrace{\left(K_{1} \rho^{-1}\left(1-e^{-\rho t_{1}}\right)+\widetilde{H}_{1}\left(t_{1}, {x}_{i}\right)+(\delta-1) \widetilde{H}_{1}\left(\max \left\{t_{1}, t_{2}\right\}, {x}_{1}\right)-A_{1}\right)}^{\equiv I} \\ & \times \underbrace{\left(K_{2} \rho^{-1}\left(1-e^{-\rho t_{2}}\right)+\widetilde{H}_{2}\left(t_{2}, {x}_{2}\right)+(\delta-1) \widetilde{H}_{2}\left(\max \left\{t_{1}, t_{2}\right\}, {x}_{2}\right)-A_{2}\right)}_{\equiv I I} \end{aligned}

with \widetilde{H}_{i}\left(t, {x}_{i}\right)=\int_{t}^{\infty} H_{i}\left(s, {x}_{i}\right) e^{-\rho s} d s.
There are 3 cases of FOC to maximize that objective function: when t^*_1 < t^*_2, when t^*_1 > t^*_2 and when t^*_1 = t^*_2. There are known bounds on t^*_1, t^*_2 for each of those cases.

In another post I ask about how to solve those FOC with NLsolve by supplying those theoretical bounds on the optimal values t^*_1, t^*_2. I’d appreciate any suggestion you may have.

2 Likes

Actually, the solution for t_1 is independent of the solution for t_2, so I can use the (faster?) univariate methods in Roots.jl. Then I can find t^*_2 with the same univariate approach.

\tilde{H} should be an ideal candidate for quadrature with fixed nodes (just shift by t).

Personally would just write a callable that evaluates that,

struct Hquad{T,V<:AbstractVector{T}}
    nodes::V
    weights::V
    ρ::T
end

function (hq::Hquad)(t, x)
    ...
end

and unit test it (because I have a tendency to mess up simple transformations like adding t and dividing by \rho), and then it should be straightforward and work seamlessly with AD.

1 Like

Hmm, this is a very good suggestion. I had defined a function

function discountedintegral(f, r, T; nodes=5_000)
    x, w = gausslaguerre(nodes)
    exp(-r*T)/r * sum( w[i]*f(x[i]/r + T) for i in 1:nodes )
end

but that computes the nodes every time it is called. With your suggestion, it would be like this:

struct Hquad{T, V <: AbstractVector{T}}
    nodes::V
    weights::V
    r::T
end

function (hq::Hquad)(f, T)
    x = hq.nodes
    w = hq.weights
    r = hq.r
    exp(-r*T)/r * sum( w[i]*f(x[i]/r + T) for i in 1:length(x) )
end

hq = Hquad(gausslaguerre(1_000)..., 0.04)

I’d wish there was a way to prevent hardcoding the number of nodes. Perhaps with memoization?

I would just save those nodes and weights in a constant once.

And unless your H is really nasty, I think you should be fine with much less than 1000. I would start with 20; when it works, Gaussian quadrature is amazingly accurate for most purposes.

I thought the point of defining a struct was to carry those nodes and weights within the struct itself. Perhaps you mean to store the number of nodes in a const? If not, I could have saved the nodes and weights and just employ a standard function that uses those consts.

I would save a version transformed by \rho in the struct already (especially if it does not change often), but the way you propose is fine too.