Expressions vs Variables+Constraints in JuMP models

Hello, and sorry for the naive question (optimization is not my primary field of expertise). And sorry if this is already discussed in the documentation; I didn’t see it.

It seems to me that “expressions” in JuMP could always be replaced with new variables subject to equality constraints. Taking the example from the documentation:

using JuMP
model = Model();
@variable(model, x)
@variable(model, y)
ex = @expression(model, 2x + y - 1)
@objective(model, Min, 2 * ex - 1)

I think the following model would be equivalent:

model = Model();
@variable(model, x)
@variable(model, y)
@variable(model, z); @constraint(model, z == 2x + y - 1)
@objective(model, Min, 2 * z - 1)

And, perhaps more interestingly, it seems to me that every equality constraint in a model could instead be expressed as an expression (which would allow removing one variable).

So I guess my questions would be:

  • Am I right to think that expressions and variables + equality constraints are mathematically equivalent?
  • If so, is there a reason to prefer expressions (I’m thinking perhaps performance-wise)?
  • If so, should I look for all equality constraints in my model and convert them to expressions? (But can’t I rely on the pre-solver doing this for me?)

the " = " in a expression is not equivalent to " == " in a constraint.

There is an immediate obvious difference between your first example

julia> print(model)
Min 4 x + 2 y - 3
Subject to

and the second

julia> print(model)
Min 2 z - 1
Subject to
 -2 x - y + z = -1

and that is that the first has no constraints and the second does, with an extra variable.
You can see this, again, for the first:

julia> num_constraints(model; count_variable_in_set_constraints = true)
0

julia> num_variables(model)
2

while for the second

julia> num_constraints(model; count_variable_in_set_constraints = true)
1

julia> num_variables(model)
3

Sorry for not being clear: I understand the models are not exactly the same. But I think they are “equivalent” (in the sense that solving one gives the same result as the other):

julia> let
           model = Model(HiGHS.Optimizer);
           @variable(model, 0 <= x <= 1)
           @variable(model, 0 <= y <= 1)
           ex = @expression(model, 2x + y - 1)
           @objective(model, Min, 2 * ex - 1)
           optimize!(model)
           value.((x, y, ex))
         end
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
0 rows, 0 cols, 0 nonzeros
0 rows, 0 cols, 0 nonzeros
Presolve : Reductions: rows 0(-0); columns 0(-2); elements 0(-0) - Reduced to empty
Solving the original LP from the solution after postsolve
Solving an unconstrained LP with 2 columns
Model   status      : Optimal
Objective value     : -3.0000000000e+00
HiGHS run time      :          0.00
(0.0, 0.0, -1.0)
julia> let
           model = Model(HiGHS.Optimizer);
           @variable(model, 0 <= x <= 1)
           @variable(model, 0 <= y <= 1)
           @variable(model, z); @constraint(model, z == 2x + y - 1)
           @objective(model, Min, 2 * z - 1)
           optimize!(model)
           value.((x, y, z))
       end
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
0 rows, 2 cols, 0 nonzeros
0 rows, 0 cols, 0 nonzeros
Presolve : Reductions: rows 0(-1); columns 0(-3); elements 0(-3) - Reduced to empty
Solving the original LP from the solution after postsolve
Model   status      : Optimal
Objective value     : -3.0000000000e+00
HiGHS run time      :          0.00
(0.0, 0.0, -1.0)

If I read the solver output correctly, I think the pre-solver used the equality constraint in the second model to remove the z variable… Is that not what happens?


Yes, you’re pretty much correct. The complication is that each solver applies its own presolve, depending on what structure its presolve algorithm recognises. The difference when using JuMP expressions is that the transformation happens at the modeling layer (JuMP and MathOptInterface), not in the solver.
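
As a quick check (assuming the expression-based model from the first post), you can query the objective before calling optimize! and see that the substitution has already happened on the JuMP side:

julia> objective_function(model)
4 x + 2 y - 3

The 2 * ex - 1 was flattened into a single affine function when the objective was set, so the solver never sees ex at all.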


Thanks a lot, that does clarify things. My understanding is then that it’s “better” to use expressions when one can, because:

  • the resulting model is “tight” by construction, instead of relying on pre-solving techniques which vary from solver to solver;
  • even when the solver does implement the relevant presolve, I’d assume it’s faster for JuMP/MOI to substitute expressions directly at model construction than it is for a pre-solver to (1) recognise an expression in the model structure and (2) substitute it ex post.

If I may ask a related follow-up question: suppose that in a larger model, I have something like this:

model = Model()
@variable(model, 0 <= x[1:100] <= 1)
@constraint(model, sum(x) == 1)

From a mathematical standpoint, it is possible to express x[100] as 1 - sum(x[1:99]) and remove it from the model altogether (along with the associated constraint). Although I don’t know (yet) how to write it with JuMP in practice, I’m confident I could do it. But that would probably come at the cost of a less readable model construction (for example, I would lose the IMO nice property that all x[i] play symmetric roles in the model).
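
For concreteness, I imagine something like the following rough, untested sketch, where the bound on the eliminated variable has to become an explicit constraint:

model = Model()
@variable(model, 0 <= x[1:99] <= 1)
# x[100] is replaced by an expression in the remaining variables
@expression(model, x100, 1 - sum(x))
# the former bound 0 <= x[100] <= 1 must now be stated as a constraint
@constraint(model, 0 <= x100 <= 1)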

Would that be worth it in your opinion / experience?


My understanding is then that it’s “better” to use expressions when one can, because

Not necessarily.

One thing that hasn’t been mentioned is sparsity. Sometimes (quite often, actually), it can be better to keep additional variables if doing so keeps the problem sparse. Substituting expressions into constraints can lead to rows in the constraint matrix with a lot of non-zero terms. In some cases, you’re better off adding the intermediate variables.

One common example where this is very apparent is least-squares regression:

using JuMP, Ipopt

function solve_constrained_least_squares_regression(A::Matrix, b::Vector)
    m, n = size(A)
    model = Model(Ipopt.Optimizer)
    @variable(model, x[1:n])
    @variable(model, residuals[1:m])
    @constraint(model, residuals == A * x - b)
    @objective(model, Min, sum(residuals.^2))
    optimize!(model)
    return value.(x)
end

If we made residuals an @expression instead of a @variable, then the objective function would be a dense quadratic with all pairwise combinations of the variables.
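
For comparison, that dense variant would look something like this (a sketch, using the same A, b, and x as above in place of the residuals variable and its constraint):

@expression(model, residuals, A * x - b)
@objective(model, Min, sum(residuals .^ 2))

Each residuals[i] involves up to n of the x variables, so squaring and summing produces a quadratic objective with on the order of n^2 distinct x[i] * x[j] terms.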

By using the @variable, we have some linear constraints and a quadratic objective that has only residuals[i]^2 terms (the Q matrix is diagonal).

As with most things, there is a trade-off. You could try both ways and see what works. Usually I use @expression for simple building blocks, and @variable for more complicated expressions, particularly if they will lead to a sparser problem. But there is no exact rule.
