Inconsistent solutions from JuMP vs InfiniteOpt for discretized optimal control

Hello everyone,

I have been playing around with JuMP recently to understand the structure of an optimal control problem. For clarity, the problem is the following. I want to find the function x(c,\theta) such that \forall c,\theta \in [0,1]^2 x(c,\theta) \in [0,1] and x is such that

\max_{x} \int_0^1 \int_0^1\big(2\theta-c-1\big)x(c,\theta)d\theta dc

subject to the following constraint for all \theta \in [0,1]:

\int_0^1 \Big[ \big(2c-\theta \big)x(c,\theta) + \int_0^\theta x(c,t)dt\Big] dc =0

My first attempt has been to write up a manually discretized version of the maximization problem using JuMP. Interpret the integrals as expectations over the uniform distribution. Here’s a minimal working example.

using JuMP, HiGHS
types = 11
prob = ones(types)./types;
theta, c = [collect(0:(types-1)^(-1):1) for i=1:2]

lp = Model(HiGHS.Optimizer)

@variable(lp,0<=x[1:types,1:types]<=1)

@constraint(lp, EPICconst[j = 1:types] , sum(prob[i]*((2*c[i] - theta[j])*x[i,j] 
                    + sum(prob[k]*x[i,k] for k in 1:j)) for i in 1:types)==0);

@objective(lp, Max, sum(prob[j]*sum(prob[i]*(2*theta[j] - c[i] -1)*x[i,j] 
                        for i in 1:types) for j in 1:types));

optimize!(lp);

This model solves very quickly and returns an optimal solution. I ran this code with various values of types, up to 41, to get a sense of whether small discretizations were affecting the result. The results seem to be very consistent: the matrix representing the function x(c,\theta) is pretty much unaltered by changing the types variable.

Now, because this is an optimal control problem, I thought to try and use the InfiniteOpt.jl package. The same problem is quite simple to formulate, once one observes that we can write \int_0^\theta x(c,t) dt = X(c,\theta), and thus \frac{\partial X}{\partial \theta}(c,\theta) = x(c,\theta).
Below a minimal working example again, this time using InfiniteOpt.jl

using InfiniteOpt, HiGHS
types = 11

model = InfiniteModel(HiGHS.Optimizer)

@infinite_parameter(model, c in [0,1], num_supports = types)
@infinite_parameter(model, theta in [0,1], num_supports = types);

@variable(model, 0<=x<=1, Infinite(c,theta), start = 10)
@variable(model, 0<=X, Infinite(c,theta), start=10)

@constraint(model, integral((2*c-theta)*x + X, c) == 0)
@constraint(model, x == @deriv(X,theta))

@objective(model, Max, integral(integral((2*theta-1-c)*x,c),theta))
optimize!(model)

This is even simpler to write, and observe I did not provide boundary conditions for X(c,\theta) since I do not care about the value of X, only of x(c,\theta). The solver finds a solution, I assume by discretizing according to the types variable. However, the solution is different from the one found by manually discretizing and then solving with JuMP! In fact, it is worse - the optimal value of the objective is ~10\% worse than the objective of the manually discretized model.

I can’t wrap my head around it. Does the issue have anything to do with discrete vs. continuous control? Or is it something that happens inside the optimization procedure and I shouldn’t trust InfiniteOpt?

Sorry for the long post, and thank you in advance!

1 Like

This seems like a question for @pulsipher. I assume your infinite opt model isn’t quite right, but I don’t know the details. The JuMP model looks good.

1 Like

At first glance, one difference is that your JuMP model approximates the integrals via a sampling scheme whereas InfiniteOpt uses trapezoid rule by default. Moreover, the differential equation will be evaluated using implicit Euler by default.

Hence, the discretized InfiniteOpt model ends up being setup quite a bit differently than the JuMP model you’ve defined. I suspect InfiniteOpt’s formulation provides a more accurate solution relative to the continuous case, but I will need to take a closer look. I should have some time later tomorrow.

2 Likes

Thank you both for your answers!

I suspect that the optimal x is such that x(c,\theta) \in \{0,1\} almost everywhere. This is apparent in the unconstrained problem, and I think the constraints do not change the structure. I may be wrong, but for a step function the trapezoid method seems worse than the sampling scheme I use in the JuMP solution.

For the same reasons there may be a problem with the derivative, if it is assumed that x is smooth. Curious to hear whether there are better approximation techniques for step functions - or if there is a way to write the constraint in the continuous model without defining a new variable X.

1 Like

Ok, I tried a lot of things and this is what I found.

With your JuMP.jl model, I get:

julia> value.(x)
11×11 Matrix{Float64}:
 0.0  0.0  0.0  0.0  0.0  -0.0  -0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0804598  0.173624  0.283987  0.418876
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0

julia> objective_value(lp)
0.093492240953056

So, we do get some values that at not 0 or 1.

With your InfiniteOpt.jl model, I get:

julia> value(x, ndarray = true)
11×11 Matrix{Float64}:
 0.0  0.0  0.0  0.0  -0.0  -0.0  -0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.5  0.5  0.5  0.5
 0.0  0.0  0.0  0.0   0.0   0.0   0.0  -0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0  -0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0  0.0  0.0  0.0

julia> objective_value(model)
0.07500000000000001

So, our choice of approximation schemes definitely affects the solution we get. It’s interesting here to see that the non 0 or 1 values are exactly 0.5.

Now let’s see what happens when we formulate the InfiniteOpt model without the differential equation, but instead use nested integrals like your JuMP model. Having an infinite parameter as an integral bound is not a use case that we considered for InfiniteOpt, so this is not supported currently. We can hack a little bit around this and just use a manual sum like you did in JuMP.

using InfiniteOpt, HiGHS
types = 11

model = InfiniteModel(HiGHS.Optimizer)

@infinite_parameter(model, c ∈ [0,1], num_supports = types)
@infinite_parameter(model, θ ∈ [0,1], num_supports = types);

@variable(model, 0 ≤ x ≤ 1, Infinite(c, θ), start = 10)

θs = supports(θ)
@constraint(model, [j ∈ eachindex(θs)], ∫((2c - θs[j])*x(c, θs[j]) + sum(1/types * x(c, θs[k]) for k ∈ 1:j), c) == 0)

@objective(model, Max, ∫(∫((2θ - 1 - c)*x, c), θ))
optimize!(model)

Now we get:

julia> value(x, ndarray = true)
11×11 Matrix{Float64}:
 0.0  0.0  0.0  0.0  0.0  -0.0  -0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0       1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.592308  0.705128  0.849434  1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0       0.0       0.0       0.0283575
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0       0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0       0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0       0.0       0.0       0.0

julia> objective_value(model)
0.07655678791492745

So, we are still different than the JuMP code using trapezoid rule in the remaining integrals.

Ok, let’s now force InfiniteOpt.jl to use a the same simple averaging scheme using support_sum:

using InfiniteOpt, HiGHS
types = 11
prob = 1 / types

model = InfiniteModel(HiGHS.Optimizer)

@infinite_parameter(model, c ∈ [0,1], num_supports = types)
@infinite_parameter(model, θ ∈ [0,1], num_supports = types);

@variable(model, 0 ≤ x ≤ 1, Infinite(c, θ), start = 10)

θs = supports(θ)
@constraint(model, [j ∈ eachindex(θs)], support_sum(prob * ((2c - θs[j])*x(c, θs[j]) + sum(prob * x(c, θs[k]) for k ∈ 1:j)), c) == 0)

@objective(model, Max, support_sum(prob * support_sum(prob * (2θ - 1 - c)*x, c), θ))
optimize!(model)

Now we get:

julia> value(x, ndarray = true)
11×11 Matrix{Float64}:
 0.0  0.0  0.0  0.0  0.0  -0.0  -0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  1.0        1.0       1.0       1.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0804598  0.173624  0.283987  0.418876
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0
 0.0  0.0  0.0  0.0  0.0   0.0   0.0  0.0        0.0       0.0       0.0

julia> objective_value(model)
0.093492240953056

Which is exactly the same since the JuMP model and the discretized InfiniteOpt model are equivalent. So, the difference in integral approximation seems to be the chief difference.

Ok, now let’s add the differential equation back in:

using InfiniteOpt, HiGHS
types = 11
prob = 1 / types

model = InfiniteModel(HiGHS.Optimizer)

@infinite_parameter(model, c ∈ [0,1], num_supports = types)
@infinite_parameter(model, θ ∈ [0,1], num_supports = types);

@variable(model, 0 ≤ x ≤ 1, Infinite(c, θ), start = 10)
@variable(model, 0 ≤ X, Infinite(c, θ), start =10)

@constraint(model, support_sum(prob * ((2c - θ)*x + X), c) == 0)
@constraint(model, x == ∂(X, θ))

@objective(model, Max, support_sum(prob * support_sum(prob * (2θ - 1 - c)*x, c), θ))
optimize!(model)

With this, now we get:

value(x, ndarray = true)
11×11 Matrix{Float64}:
 0.0  0.0  0.0  0.0  -0.0  -0.0  -0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   1.0   1.0   1.0   1.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0  -0.0  -0.0  -0.0  -0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0  -0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0  -0.0   0.0   0.0   0.0
 0.0  0.0  0.0  0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

julia> objective_value(model)
0.09256198347107437

Which is much closer to the JuMP solution relative to the original InfiniteOpt formulation and now it only gets 0s or 1s. So, it seems the principle difference between your formulations was the choice of integral approximation scheme.

Interestingly, we can use other approximation schemes for integrals in InfiniteOpt and we can try using a high-fidelity quadrature scheme FEGaussLobatto:

using InfiniteOpt, HiGHS
types = 11
prob = 1 / types

model = InfiniteModel(HiGHS.Optimizer)

@infinite_parameter(model, c ∈ [0,1], num_supports = types)
@infinite_parameter(model, θ ∈ [0,1], num_supports = types);

@variable(model, 0 ≤ x ≤ 1, Infinite(c, θ), start = 10)
@variable(model, 0 ≤ X, Infinite(c, θ), start = 10)

# change default of integral evaluation
set_uni_integral_defaults(eval_method = FEGaussLobatto(), num_nodes = 5)

@constraint(model, ∫((2c - θ)*x + X, c) == 0)
@constraint(model, x == ∂(X, θ))

@objective(model, Max, ∫(∫((2θ - 1 - c)*x, c), θ))
optimize!(model)

This gets:

julia> value(x, ndarray = true)
11×11 Matrix{Float64}:
 0.0  0.0  0.0   0.0  -0.0  -0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0   1.0  1.0  1.0  1.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0  -0.0  0.0  0.0  0.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0  -0.0  0.0  0.0  0.0
 0.0  0.0  0.0   0.0   0.0   0.0  0.0  -0.0  0.0  0.0  0.0
 0.0  0.0  0.0  -0.0   0.0   0.0  0.0  -0.0  0.0  0.0  0.0

julia> objective_value(model)
0.07404655597983774

Interestingly, this recovers the sample formulation if we choose num_nodes=2, but there currently is a bug that I need to fix to run it.

Finally, if I run the JuMP model but let types=100, we get:

julia> objective_value(lp)
0.07602375049463848

Which is close to what InfiniteOpt obtains with only types=11.

So, at the end of day we can clearly see that our choice of approximation affects the solution we get. Given the performance of the discrete JuMP model as we increase the number of supports, it seems that the default approximation schemes used by InfiniteOpt are more accurate relative to the continuous time case.

3 Likes