what can I do if the optimization results in “Converged to a point of local infeasibility.
Problem may be infeasible.” despite of changing the start point?
I am using JuMP+Ipopt
What is your objective function and what are your constraints? Do you have a MWE?
here is the whole problem from the original paper
I’m sorry what is MWE?!
It’s my first time solving such a problem
An MWE is a minimum working (or “workable”) example; so a code that contains the essential features of your issue, is runnable and reproduces the problem you are facing. See also here: Please read: make it easier to help you
Anyhow, in your case: have you started the optimization from a point that fulfills the constraints or not? If not, that would be my starting guess of what’s wrong. Or, of course, the problem may not be correctly specified (i.e. you are actually trying to solve something else than what you think)
good point!
I’ve just considered the boundaries and the second constraint.
is there a function or something else that could help me generate start values as so, or should I do it manually?
It’s easier to provide help if you supply the code instead of the LaTeX formulation.
Do you know your problem has a feasible solution?
There is no special function to generate feasible solution. One good approach is to comment out constraints and try re-solving the problem until you find a feasible solution. That can tell you which constraints are problematic.
Also, you c_j v_j
is non-convex. Ipopt assumes problems are convex, so you will only find a local minima, and Ipopt may struggle to deal with feasibility (like in this case).
Here’s the code for loading the data and other preparations (I’m not sure you need it or not):
using LinearAlgebra
using Pkg
using Random
s_matrix=CSV.File("stoichiometric matrix of iJO1366.csv");
s_matrix_df=DataFrames.DataFrame(s_matrix)
reaction_names=names(s_matrix_df)
data=CSV.File("list of metabolites added.csv")
data_df=DataFrame(data)
list_of_measured_rxns=CSV.File("measured fluxes.csv")
list_of_measured_rxns_df=DataFrame(list_of_measured_rxns)
indx_of_P=[findall(x->x==item,reaction_names) for item in list_of_measured_rxns_df[:,"rxn"]]
indx_of_P2=[item[1] for item in indx_of_P ]
glc_indx=findall(x->x=="GLCptspp",reaction_names )-[1]
uptake=9.654
all_indx=1:size(s_matrix_df,2)
the_rest_indices=setdiff(all_indx,indx_of_P)
the_rest_indices=setdiff(the_rest_indices,glc_indx)
the_rest_indices=setdiff(the_rest_indices,[1])
measured_fluxes=list_of_measured_rxns_df[:,"flux"]
and here is the formulation:
model=JuMP.Model(with_optimizer(Ipopt.Optimizer))
@variable(model,v[1:(size(s_matrix_df,2)-1)]>=0) # -1 is because the first column in metabolite names not a reaction
@variable(model,u[1:size(s_matrix_df,1)] )
@variable(model,c[1:size(indx_of_P,1)]>=0)
@variable(model,g)
@NLconstraint(model,sum(c[j] * v[indx_of_P2[j]] for j in 1:length(c)) ==uptake*g ) ###it's just for P #1+
@constraint(model,sum(c)==1) #+ #2 ++
@constraint(model, [dot(s_matrix_df[i,2:end],v) for i in 1:size(s_matrix_df,1)] .==0 ) #3++
@constraint(model,v[glc_indx].==[uptake]) #4 ++
@constraint(model,[dot(u,s_matrix_df[:,indx_of_P2[j]])-c[j] for j in 1:size(indx_of_P2,1) ].>=0) #5 ++
@constraint(model,[dot(u,s_matrix_df[:,j]) for j in the_rest_indices].>=0) #6 +
@constraint(model,dot(u,s_matrix_df[:,glc_indx[1]])+g>=0) #7 +
@NLobjective(model,Min,sum((v[indx_of_P2[j]]-measured_fluxes[j])^2 for j in 1:size(measured_fluxes,1)))
for i in 1:size(v,1)
set_start_value(v[i],rand(0:0.01:1000))
end
for i in 1:size(u,1)
set_start_value(u[i],rand(-1000:0.01:1000))
end
set_start_value(g,rand())
c_init=rand(size(c,1))
s_c=sum(c_init)
c_start=c_init/s_c
sum(c_start)==1
for i in 1:size(c,1)
set_start_value(c[i],c_start[i])
end
JuMP.optimize!(model)
(How can I load the CSV files here (if needed)?)
The problem was solved in the original paper(DOI: 10.1002/bit.10617), but with much less number of variables :“Problems
containing as many as 200 variables were solved in seconds
using MINOS 5.0 accessed via the GAMS modeling environment”
The problem which I’m trying to solve has about 4000 variables
And I’ve noticed the problem is not convex @odow , but I hoped changing the start point could help
Setting a random starting point is unlikely to help. You really need to compute an initial feasible solution.
As one example, v[glc_indx]
has to be uptake
in an optimal solution, but you set the start value to a random number.
You can also simplify a few of your constraints. No need to wrap things in []
and use .==
.
# Simplify
@constraint(model,[dot(u,s_matrix_df[:,j]) for j in the_rest_indices].>=0)
# to
@constraint(model, [j in the_rest_indices], u' * s_matrix_df[:, j] >= 0)
See these docs: Constraints · JuMP
While I’m thinking about this…
not related to your optimisation problem, but these types of lines can be compressed as:
s_matrix_df = CSV.read("stoichiometric matrix of iJO1366.csv",DataFrame)
I initially thought this looked like a KKT conditions of a bilevel program, and looks like that was accurate.
So Oscar’s suggestion is a good one, and the paper indicates that you could do this by making some different choices for the weights c_j
and see if you can at least solve the (Primal) lower level problem for those different choices of c
. Your existing random seed code should work fine :
You can use a linear programming solver to do this such as Clp, not Ipopt.
Note that in this approach, the c
’s become fixed values in the program, not variables.
If successfully able to solve the Primal for random c
’s, then you’ll be more confident the input data is basically sound and can focus on the formulation of the nonlinear version.
Coming after a long break…
I admit my code is not the best possible as solving this NLP is my first experience in julia and I formulated the problem as fast as I could to come to the solving challenges.
Anyway thanks for your guidance
I’m not familiar with KKT conditions…
If I’ve understood what you meant correctly the Primal was infeasible
I tried to solve this(the primal) with Clp
as you mentioned :
the result is :
As @odow suggested I also commented out the constraints, apparently constraint #3 which also exists in the primal is making the problem
What can I do now? It’s the most fundamental one.
apparently constraint #3
From my previous reply:
Do you know if your problem has a feasible solution?
I suggest you try manually constructing an initial feasible solution and provide that as a starting point. If you can’t construct an initial feasible solution, then there is probably something wrong with your modeling or data.