I’m working on a project in which, given a vector of parameters θ, I have to solve a nonlinear system of equations that involve some integrals. After solving the system, I need to optimize with respect to θ to obtain optimal parameter values that fit some data. That means that I have to solve that system of equations many times.
I have not used any packages for integration, but I have used Optim.jl and NLsolve.jl in the past. However it’s been a while since I last worked with those packages and I’m wondering if they are still the best in Julia. Also, I’m looking for suggestions of what other packages I should look at. I know that there is GalacticOptim.jl and Quadrature.jl that wrap many other optimization and integration packages, any others? And what about nl solvers? What packages should I look at that play nice with Optim or GalacticOptim, Quadrature and NLsolve?
Depending on your integrals (dimension, smoothness, etc), consider approximating with a Gaussian quadrature, eg as in
and from then on you effectively have a sum of function values with (potentially) parameter-dependent weights, in any case it should be easy to handle with any nonlinear solver.
If this is about structural estimation, I recently asked a similar question
and got some good suggestions, which I am working through. Currently I find NLopt (Augmented Lagrangian for the constraint) very robust, but that’s on a toy problem.
Quadrature.jl is compatible with automatic differentiation, so it should play nice with the whole ML ecosystem, etc. It just composes. See for example this example of optimization under uncertainty, mixing autodiff, Quadrature.jl, DifferentialEquations.jl, event handling, multithreading, and a bit more:
The integrals are fairly simple: \int_0^T K e^{-\rho s} ds and \int_T^\infty H(s, x) e^{-\rho s} ds. I can pick the functional form of H, but probably it will be of the form Z(s)\phi(x). I suppose these are simple enough for FastGaussianQuadrature.jl?
It is about structural estimation, but not as general as your problem. It’s not a GE model, but rather a duration model with behavioral interpretation.
That seems to call a library outside of Julia. I suppose that automatic differentiation does not work here. Also, last time I worked with Optim, it was not easily parallelizable, what about NLopt?
I’d like to play with that. I think it could be fastest way to integrate my functions. Can you provide a reference? Wikipedia doesn’t go in depth for the use of Laguerre polynomials in integrals.
EDIT: Found it in Judd, but if you have another reference I’d appreciate it.
with \widetilde{H}_{i}\left(t, {x}_{i}\right)=\int_{t}^{\infty} H_{i}\left(s, {x}_{i}\right) e^{-\rho s} d s.
There are 3 cases of FOC to maximize that objective function: when t^*_1 < t^*_2, when t^*_1 > t^*_2 and when t^*_1 = t^*_2. There are known bounds on t^*_1, t^*_2 for each of those cases.
In another post I ask about how to solve those FOC with NLsolve by supplying those theoretical bounds on the optimal values t^*_1, t^*_2. I’d appreciate any suggestion you may have.
Actually, the solution for t_1 is independent of the solution for t_2, so I can use the (faster?) univariate methods in Roots.jl. Then I can find t^*_2 with the same univariate approach.
\tilde{H} should be an ideal candidate for quadrature with fixed nodes (just shift by t).
Personally would just write a callable that evaluates that,
struct Hquad{T,V<:AbstractVector{T}}
nodes::V
weights::V
ρ::T
end
function (hq::Hquad)(t, x)
...
end
and unit test it (because I have a tendency to mess up simple transformations like adding t and dividing by \rho), and then it should be straightforward and work seamlessly with AD.
Hmm, this is a very good suggestion. I had defined a function
function discountedintegral(f, r, T; nodes=5_000)
x, w = gausslaguerre(nodes)
exp(-r*T)/r * sum( w[i]*f(x[i]/r + T) for i in 1:nodes )
end
but that computes the nodes every time it is called. With your suggestion, it would be like this:
struct Hquad{T, V <: AbstractVector{T}}
nodes::V
weights::V
r::T
end
function (hq::Hquad)(f, T)
x = hq.nodes
w = hq.weights
r = hq.r
exp(-r*T)/r * sum( w[i]*f(x[i]/r + T) for i in 1:length(x) )
end
hq = Hquad(gausslaguerre(1_000)..., 0.04)
I’d wish there was a way to prevent hardcoding the number of nodes. Perhaps with memoization?
I would just save those nodes and weights in a constant once.
And unless your H is really nasty, I think you should be fine with much less than 1000. I would start with 20; when it works, Gaussian quadrature is amazingly accurate for most purposes.
I thought the point of defining a struct was to carry those nodes and weights within the struct itself. Perhaps you mean to store the number of nodes in a const? If not, I could have saved the nodes and weights and just employ a standard function that uses those consts.