Include fixed point in JuMP


I am using JuMP to maximize a nonlinear function. The function takes two vector inputs: d and p. The dimension of p is much smaller than the dimension of d.

The trick is that the value of d is determined by p. Given any value of p, I can use the fixed_point function to solve for d. I want to know how I can calculate p in JuMP. I tried to rewrite the fixed point as a constraint, but it takes too long to solve the function because it searches over the entire parameter space of d and p.

Any advice is deeply appreciated!

function fixed_point(d, p, Xmat, Zmat, u, tol=1e-5, maxiter=1e5)
    d_old = d
    normdiff = Inf
    iter = 0
    while normdiff > tol && iter <= maxiter
        s = calc_s(d, p, Xmat, Zmat)
        d_new = d_old + u - s
        normdiff = norm(d_new - d_old)
        d_old = d_new
        iter += 1
    return d_old

Maybe the syntax of Optimization.jl is more flexible for this one?

And if you need to take gradients of your fixed point shenanigans, I’ve heard ImplicitDifferentiation.jl is what you need. Disclaimer: I happen to know who wrote it

1 Like

Can you give more information on the mathematical form of your functions and what they are expressing?

Thank you! I have a version using Optimization.jl. I want to try JuMP version to see which one runs faster.

1 Like

Sure. Maybe it is helpful to post my code. I want to maximize over theta and delta, the dimension of delta is very large (>500). I am expressing the fixed point using the @NLconstraint right now. I hope I can somehow back out the value of delta so that I only need to maximize over theta. Thank you!

using JuMP, Ipopt

m = Model(Ipopt.Optimizer)

@variable(m, θ[1:4])

@variable(m, δ[1:H])

mu = @expression(m, X * reshape(θ, (K, L)) * Z')

v = @expression(m, δ .+ mu)

ev = @NLexpression(m, [h = 1:H, i = 1:N], exp(v[h, i]))

evsum = @NLexpression(m, [i = 1:N], sum(ev[h, i] for h = 1:H) + 1)

sih = @NLexpression(m, [h = 1:H, i = 1:N], ev[h, i] / evsum[i])

ln_sih = @NLexpression(m, [h = 1:H, i = 1:N], log(sih[h, i]))

sh = @NLexpression(m, [h = 1:H], sum(sih[h, i] for i = 1:N)/N)

@NLconstraint(m, [h = 1:H], abs(obs_s_h[h] - sh[h]) <= 0.0001)

    sum(Ymt[h, i] * ln_sih[h, i] for h = 1:H, i = 1:N)
@time optimize!(m)

Do you have a log of the Ipopt solve? Is it taking too long to solve? You generally shouldn’t worry about implicitly defined variables.

One thing to try would be to replace the abs constraint with:

@variable(m, -0.0001 <= abs_diff[1:H] <= 0.0001)
@NLconstraint(m, [h = 1:H], obs_s_h[h] - sh[h] == abs_diff[h])

Ipopt assumes functions are smooth and twice differentiable, so it struggles with abs at the kink.


Thank you for the suggestion! The way to rewrite the constraint reduces the total runtime by 75%!

I still think it takes too long to solve… I am currently testing the code using a 0.1% sample of my final dataset and I only include a small number of parameters.

My current estimation uses 4 theta, 29 delta, and 1266 observations (the dimension N). My final estimation will have 50 theta, 600 delta, and 8 million N. I am afraid that the total estimation time will explode exponentially once I enlarge the parameter space and increase N.

So I want to find ways to improve the speed as much as possible. I tried the GAMS solver and the performance was not so much better than Ipopt. And it is not straightforward to me how to use parallelization or GPU to optimize the code.

This is Ipopt version 3.14.13, running with linear solver MUMPS 5.6.0.

Number of nonzeros in equality constraint Jacobian...:      986
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:    16830

Total number of variables............................:       62
                     variables with only lower bounds:        0
                variables with lower and upper bounds:       29
                     variables with only upper bounds:        0
Total number of equality constraints.................:       29
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0 -4.3059159e+03 1.01e-01 9.98e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1 -4.2931883e+03 1.00e-01 3.32e+02  -1.0 3.98e+00    -  1.00e+00 5.89e-03f  1
   2 -5.8695800e+03 8.61e-02 5.75e+01  -1.0 6.73e+00    -  1.00e+00 1.00e+00h  1
   3 -5.9840057e+03 8.26e-02 1.01e+04  -1.0 1.26e+01    -  9.79e-02 1.95e-02H  1
   4 -5.3779025e+03 7.86e-02 5.77e+03  -1.0 4.84e+01    -  7.02e-02 5.18e-02f  1
   5 -4.4302955e+03 6.14e-02 3.70e+03  -1.0 7.28e+00    -  2.28e-01 2.58e-01f  1
   6 -3.6356193e+03 7.71e-03 5.36e+02  -1.0 1.67e+00    -  9.14e-01 1.00e+00f  1
   7 -3.6277681e+03 5.51e-03 9.49e+02  -1.0 3.50e-01    -  1.00e+00 2.88e-01f  1
   8 -3.6107667e+03 2.29e-04 3.11e+00  -1.0 2.42e-01    -  1.00e+00 1.00e+00f  1
   9 -3.6104567e+03 7.32e-06 1.89e+00  -1.7 3.82e-02    -  9.57e-01 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10 -3.6092415e+03 1.63e-06 3.55e+00  -1.7 5.14e-02    -  1.00e+00 7.89e-01f  1
  11 -3.6090828e+03 1.80e-08 1.82e-05  -1.7 7.80e-03    -  1.00e+00 1.00e+00f  1
  12 -3.6077895e+03 5.94e-07 2.03e+00  -3.8 4.61e-02    -  8.53e-01 1.00e+00f  1
  13 -3.6071906e+03 1.60e-07 1.33e-04  -3.8 2.33e-02    -  1.00e+00 1.00e+00f  1
  14 -3.6071694e+03 3.65e-10 2.15e-03  -5.7 1.08e-03    -  9.91e-01 1.00e+00f  1

Number of Iterations....: 14

                                   (scaled)                 (unscaled)
Objective...............:   7.6543355921610851e+02   -3.6071693861925792e+03
Dual infeasibility......:   2.1494505111832041e-03    1.0129464520756635e-02
Constraint violation....:   3.6544832360729720e-10    3.6544832360729720e-10
Variable bound violation:   3.7365261549183060e-09    3.7365261549183060e-09
Complementarity.........:   6.4517833702301653e-06    3.0404566378399342e-05
Overall NLP error.......:   9.6610738129587401e-04    1.0129464520756635e-02

Number of objective function evaluations             = 16
Number of objective gradient evaluations             = 15
Number of equality constraint evaluations            = 16
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 15
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 14
Total seconds in IPOPT                               = 166.822

EXIT: Optimal Solution Found.
173.419688 seconds (9.88 M allocations: 2.795 GiB, 0.80% gc time, 0.31% compilation time)

I imagine you are trying to fit a multinomial logit model via MLE with some kind of aggregate market share constraint? I would start by simplifying expressions involving logistic probabilities. This is because directly computing s_{ih} = \frac{\exp(\delta_h + x_{ih}'\theta)}{1 + \sum_{j=1}^H \exp(\delta_j + x_{ij}'\theta)} and \log(s_{ih}) and their derivatives can cause numerical problems when exp’s become very large. For instance, notice that

\log(s_{ih}) = (\delta_h + x_{ih}'\theta) - \log \left(1 + \sum_{j=1}^H \exp(\delta_j + x_{ij}'\theta) \right)

so we have

\sum_{h=1}^H Y_{ih} \log(s_{ih}) = \left[ \sum_{h=1}^H Y_{ih} (\delta_h + x_{ih}'\theta) \right] - \log \left(1 + \sum_{j=1}^H \exp(\delta_j + x_{ij}'\theta) \right)

This is much easier to deal with and commonly used for multinomial logit estimation (check page 6 of these lecture notes for example). If you still have issues after simplifying things, use the functions in LogExpFunctions.jl for the remaining expressions involving log’s and exp’s so that they are numerically stable.

Maybe you can also just use the equality constraint N * s_h^{obs} = \sum_{i=1}^N s_{hi} and let the solver worry about satisfying that with some tolerance. By the way, we have s^{obs} \neq \frac{1}{N} \sum_{i=1}^N Y_i right? Otherwise, I’m not sure if this constraint is statistically meaningful for MLE.

1 Like

Just throwing a random idea but maybe seeing this as a bilevel problem can help?

Can you provide a reproducible example i can copy paste? It shouldn’t take 3 minutes for 14 iterations.

1 Like

Hi @odow, I’ve uploaded the data here. I used the wrong data in my previous exercise. Now it is taking 10 mins to run…

Thank you!

data = load("data.jld")["data"]
using JuMP, Ipopt
function ipopt_jump(data, iter=1000, tol=1e-3)

    X, Z, Y, obs_mkt_share = data
    N, L = size(Z)
    H, K = size(X)
    P = K*L

    m = Model(Ipopt.Optimizer)
    set_attributes(m, "tol" => tol, "max_iter" => iter)

    @variable(m, θ[1:P])
    @variable(m, δ[1:H])
    mu = @expression(m, X * reshape(θ, (K, L)) * Z')
    v = @expression(m, δ .+ mu)
    ev = @NLexpression(m, [h = 1:H, i = 1:N], exp(v[h, i]))
    evsum = @NLexpression(m, [i = 1:N], sum(ev[h, i] for h = 1:H) + 1)
    sih = @NLexpression(m, [h = 1:H, i = 1:N], ev[h, i] / evsum[i])
    ln_sih = @NLexpression(m, [h = 1:H, i = 1:N], log(sih[h, i]))

    # fixed point constraint
    @variable(m, -0.0001 <= abs_diff[1:H] <= 0.0001)
    sh = @NLexpression(m, [h = 1:H], sum(sih[h, i] for i = 1:N)/N)
    @NLconstraint(m, [h = 1:H], obs_mkt_share[h] - sh[h] == abs_diff[h])

        sum(choice[h, i] * ln_sih[h, i] for h = 1:H, i = 1:N)

    return (objective_value(m), value.(θ), value.(δ))