How to rectify a `v::Vector{Float64}` so that `sum(v) == 0`?

I have a vector `v::Vector{Float64}` which satisfies `isapprox(sum(v), 0; atol = 1e-12)`.
Now I want to rectify `v` somehow so that `sum(v) == 0` becomes true.
Is there a nice method?

The following is an example where my method succeeds.

julia> import Random; Random.seed!(3234);

julia> N = 10000; v = rand(N) .- .5;

julia> v = v .- sum(v) / N;

julia> abs(sum(v)) < 1e-12 && println("this is a proper test `v`")
this is a proper test `v`

julia> v[end] -= sum(v) # rectify!
-0.28060339355754454

julia> sum(v)
0.0

julia> ans == 0
true

But is it good?

Why do you need to do this? You don’t generally expect to get exact answers from floating-point arithmetic, so being approximately zero should be acceptable for any reasonable algorithm.

And in fact the result of sum(v) depends heavily on the particular summation algorithm used (i.e. the order of summation), which can vary between Julia versions, so this isn’t a particularly well-posed problem to begin with.
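For instance (a standard illustration, not specific to this thread), a plain left-to-right sum can give different results for different orderings of the same values:

```julia
# Left-to-right summation of the same multiset of Float64 values,
# in two different orders.
a = [1.0, 1e100, 1.0, -1e100]
b = [1e100, -1e100, 1.0, 1.0]

foldl(+, a)  # → 0.0   (the 1.0 terms are absorbed by 1e100 before it cancels)
foldl(+, b)  # → 2.0   (the huge terms cancel first, so the 1.0 terms survive)
```

Note that `sum` itself uses pairwise summation, so its result can additionally differ from `foldl(+, v)` and can change if the internals change between Julia versions.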


In optimization, I work with a solver.
Suppose I added a constraint `@constraint(model, sum(v) == 0)`.
But after `v_val = value.(v)`, I find that `sum(v_val) != 0`, though it is a rather small number.

Therefore I want to rectify this v_val a little bit so that sum(v_val) becomes 0. This will make the subsequent algorithms more stable.

Can the constraint just be that abs(sum(v)) <= tol?


Yes, you’re right.

Yes, it appears it can.

The solver’s tolerance lies somewhere between the two tests below.

julia> using JuMP, Gurobi

julia> function test(v)
           model = Model(Gurobi.Optimizer)
           @variable(model, x) # a free decision variable
           @objective(model, Min, sum(v) * x)
           optimize!(model)
           @info termination_status(model)
       end;

julia> v = [-3, 3 + 1e-6] # the 1st test vector
2-element Vector{Float64}:
 -3.0
  3.000001

julia> sum(v)
1.000000000139778e-6

julia> test(v) # this test fails
[ Info: DUAL_INFEASIBLE

julia> v = [-3, 3 + 9e-7] # the 2nd test vector
2-element Vector{Float64}:
 -3.0
  3.0000009

julia> sum(v)
8.999999998593466e-7

julia> test(v) # this is usable in practice
[ Info: OPTIMAL

Besides covering up the tolerances of the solver and the dependence on the order of summation, there’s a third reason not to do this: if the arbitrarily chosen end element is very small, the rectifying value can significantly change its magnitude or even flip its sign, which you may not get away with.

julia> x = [1e5, -1e5, 1e-11]; x[end]
1.0e-11

julia> x[end] -= sum(x); x[end]
0.0

julia> x = [1e5 + eps(1e5), -1e5, 1e-11]; x[end]
1.0e-11

julia> x[end] -= sum(x); x[end]
-1.455191522836685e-11

> change its sign

I see.

But from another perspective

julia> x = [1e5 + eps(1e5), -1e5, 1e-11]
3-element Vector{Float64}:
  100000.00000000001
 -100000.0
       1.0e-11

julia> x[end] -= sum(x); sum(x)
1.6155871338926322e-27

julia> x[end] -= sum(x); sum(x) # success
0.0

julia> x
3-element Vector{Float64}:
  100000.00000000001
 -100000.0
      -1.4551915228366852e-11

Maybe a better approach (though perhaps meaningless in practice) would be

julia> x = [1e5 + eps(1e5), -1e5, 1e-11]
3-element Vector{Float64}:
  100000.00000000001
 -100000.0
       1.0e-11

julia> sum(x)
2.455191522836685e-11

julia> x[findmax(abs.(x))[2]] -= sum(x); sum(x)
-4.5519152283668524e-12

julia> x[findmax(abs.(x))[2]] -= sum(x); sum(x) # get stuck
-4.5519152283668524e-12

Any generic equality constraint in any optimization algorithm will have to have some kind of tolerance attached to it. You shouldn’t need to explicitly rewrite it as abs(sum(v)) ≤ tol, and indeed this transformation may be harmful to the convergence.


I didn’t even notice that subtracting the sum one time didn’t get the new sum to 0.0, because I was rechecking x[end], not sum(x). A second subtraction reached sum(x) == 0.0, but it still inverted the sign of x[end]. That just goes to show how treacherous playing with floating-point precision is.

I’ve realized this point.
An `x == y` constraint and an integrality constraint `@variable(model, x, Int)` are both enforced approximately (loosely feasible points are deemed feasible).

A related question is: suppose both modeling approaches are valid, e.g.

JuMP.@variable(model, x)
JuMP.@variable(model, epi_var)
JuMP.@constraint(model, epi_var == x)
JuMP.@objective(model, Min, epi_var)

and

JuMP.@variable(model, x)
JuMP.@variable(model, epi_var)
JuMP.@constraint(model, epi_var >= x)
JuMP.@objective(model, Min, epi_var)

These two formulations are mathematically equivalent.
Which style should we opt for in practice?
From the perspective of nonlinear optimization, the constraint epi_var == x is easier, because an == constraint is always easier than a >= constraint in terms of the associated Lagrange multiplier. I don’t know whether, in practice, Gurobi will rewrite this == constraint into two >= constraints with opposite directions.

But from the perspective of the solver, e.g. Gurobi, in my experience the >= constraint makes Gurobi more robust, because the associated dual variable is sign-constrained. (Gurobi always prefers signed variables to free variables.)

Inequality constraints are usually way easier than equality constraints, in my experience, because it is much easier for the solver to stay in a feasible set with a nonempty interior, and there are more algorithms available.

The exception is an equality constraint that is simple enough for you to algebraically solve it and exactly eliminate a variable a priori, giving the solver a lower-dimensional problem to deal with.

> The exception is an equality constraint that is simple enough for you to algebraically solve it and exactly eliminate a variable a priori, giving the solver a lower-dimensional problem to deal with.

I can understand this “elimination”, e.g. @constraint(model, x[1] + x[2] + x[3] == 2) yields x[3] := 2 - x[1] - x[2] so that we get rid of x[3].
But aside from this presolve technique, equality constraints are also simpler in another sense. The philosophy of nonlinear programming consists in converting a harder problem into a simpler one.
Usually, we convert an inequality-constrained problem into an equality-constrained problem, then, via Lagrange multipliers, into an unconstrained optimization problem, finally solving a system of linear equations involving both the primal decisions and the Lagrange multipliers.
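The elimination mentioned above can be sketched as follows (the objective f is hypothetical, chosen only for illustration):

```julia
# Eliminate x3 using the equality constraint x1 + x2 + x3 == 2,
# i.e. substitute x3 := 2 - x1 - x2 into the objective.
f(x1, x2, x3) = x1^2 + 2x2^2 + 3x3^2   # hypothetical objective
g(x1, x2) = f(x1, x2, 2 - x1 - x2)     # reduced problem in two free variables

# Any (x1, x2) now yields a point satisfying the constraint by construction
# (up to floating-point rounding of the substitution itself).
```

Minimizing g over (x1, x2) is an unconstrained problem of lower dimension than the original constrained one.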

[two screenshots omitted]

From the book “Nonlinear Programming”.

Gurobi’s algorithm for solving (large-scale) linear programs is also the barrier method, which falls into the category of nonlinear programming algorithms.

But yes, from practical experience, equality constraints appear to be more adverse for Gurobi.

As for Ipopt, there seems to be (I assume) an additional treatment of inequality constraints, compared to the more straightforward equality constraints. Getting duals for nonlinear programming problem - #7 by WalterMadelim

The first image is talking about QPs with linear (affine) equality constraints. I agree that linear equality constraints (the only type of equality constraints that appear in convex problems, of course) are often simpler to handle numerically than inequality constraints, because linear equations are nice. I was referring to more general nonlinear equality constraints appearing in nonconvex problems.

The second screenshot is about proofs, which is an entirely different matter than algorithms.


Solving a sequence of QPs is the basis of Lagrange-Newton (SQP and barrier) methods for nonconvex optimization in which nonlinear constraints are linearized at the current point.
The “easy” case is a convex equality-constrained QP, because solving it is just about linear algebra: the necessary first-order (KKT) conditions are a set of linear equations.
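For a convex equality-constrained QP, the KKT system referred to above takes the following standard textbook form (notation assumed, not from the thread):

```latex
% min_x  (1/2) x^T Q x + c^T x   subject to   A x = b,   Q positive semidefinite.
% Stationarity and primal feasibility give one symmetric linear system in (x, \lambda):
\begin{pmatrix} Q & A^T \\ A & 0 \end{pmatrix}
\begin{pmatrix} x \\ \lambda \end{pmatrix}
=
\begin{pmatrix} -c \\ b \end{pmatrix}
```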

As soon as we introduce inequality constraints (that includes bounds), things become much more complicated. The KKT conditions characterize stationary points given knowledge of the active set (the set of active inequality constraints) at the solution, which is unknown a priori. More specifically, the complementarity equations are the source of combinatorial explosion (for each inequality constraint, we ask: is it active or inactive at the solution?).
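The complementarity equations mentioned above, written for constraints g_i(x) ≤ 0 with multipliers λ_i (standard notation, assumed):

```latex
% KKT complementarity for inequality constraints g_i(x) <= 0:
\lambda_i \, g_i(x) = 0, \qquad \lambda_i \ge 0, \qquad g_i(x) \le 0
% Each constraint is either active (g_i(x) = 0) or has \lambda_i = 0,
% giving up to 2^m active-set combinations for m inequality constraints.
```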

Methods that handle inequality constraints include active-set methods, barrier (interior-point) methods, and projection methods. Essentially, they transform solving the inequality-constrained problem into solving a sequence of equality-constrained problems. In the nonconvex case, the majority of methods are infeasible: they do not require a feasible starting point and they reach feasibility only on the way towards the solution.
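As a sketch of the barrier transformation mentioned above: the inequality-constrained problem is replaced by a sequence of smooth subproblems parameterized by a barrier parameter μ driven to zero (standard log-barrier form, assumed notation):

```latex
% Log-barrier subproblem for   min f(x)   s.t.   g_i(x) <= 0:
\min_x \; f(x) - \mu \sum_i \log\bigl(-g_i(x)\bigr), \qquad \mu \downarrow 0
% Iterates stay strictly feasible for the inequalities; the constraints
% enter the objective, leaving an (essentially) unconstrained problem.
```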

Obviously, things are different for zeroth-order methods such as evolutionary algorithms, because the first-order characterization of stationary points isn’t relevant any more.


Very nice info.

I’m currently not sure what the most appropriate method is to address nonconvex quadratic problems: a traditional NLP method or a MILP-based method (e.g. Gurobi). My intuition (and what I have read in some books) says that continuous optimization is easier than discrete optimization (i.e. MIP). But I find the performance of cutting-edge MIP solvers exceptional! I suspect that powerful MIP solvers will be promising in the future.