I want to solve a linear optimization problem with quadratic constraints of the form \max_{\boldsymbol x} y \text{ such that } f_i(y \boldsymbol x) \leq 0 \: \forall \: i = 1, \dots, C
and the f_i are quadratic in the argument, i.e., it is of the form f_i(y \boldsymbol x) = \alpha_i + \sum_{j=1}^N \sum_{k=1}^N y \beta_{kl} x_j x_k.
Since the number of constraints C is on the order of thousands, I want to solve this problem rather efficiently. I tried using Convex.jl and a certain optimizer, but the equivalent cone optimization problem that Convex.jl generates is usually significantly larger, in the sense that a handful of optimization variables gets blown up into several thousand.
I am now thinking about using JuMP.jl, but I am not sure whether it supports arbitrary precision (preferably through BigFloat).
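What I would ideally like to write is something like this (just a sketch of what I have in mind; I have not verified that JuMP and a solver actually support this, but Clarabel.jl is supposedly generic over the number type):

```julia
using JuMP, Clarabel

# Hypothetical sketch: a generic model parameterized by BigFloat, paired
# with a solver that can also run in BigFloat.
model = GenericModel{BigFloat}(Clarabel.Optimizer{BigFloat})
@variable(model, x[1:4] >= big"0.0")
@objective(model, Min, sum(x))
# a toy second-order cone constraint, ||x||_2 <= 1
@constraint(model, [big"1.0"; x] in SecondOrderCone())
optimize!(model)
value.(x)  # should come back as BigFloat
```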
But note that your problem isn’t linear or quadratic: the y x_j x_k products are cubic terms, which makes it nonconvex and nonlinear. You could try Ipopt, though, and see what sort of solution you get.
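Something like this is what I’d try first (a sketch with made-up placeholder data for α and β; assumes a recent JuMP that accepts general nonlinear expressions in @constraint):

```julia
using JuMP, Ipopt

# Placeholder sizes and data, just for illustration.
N, C = 4, 10
α = rand(C) .- 2.0          # made-up; x = 0 is then feasible
β = rand(N, N)              # made-up; shared across i, as in your post

model = Model(Ipopt.Optimizer)
@variable(model, x[1:N])
@variable(model, 0 <= y <= 100)   # finite bound keeps the toy instance bounded
@objective(model, Max, y)
# cubic constraints: α_i + Σ_jk y β[j,k] x[j] x[k] ≤ 0
@constraint(model, [i in 1:C],
    α[i] + sum(y * β[j, k] * x[j] * x[k] for j in 1:N, k in 1:N) <= 0)
optimize!(model)
```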
Yeah, you are right, as written it’s cubic. Maybe I could reformulate the problem as \max_{\boldsymbol x} \Vert \boldsymbol x \Vert \text{ such that } f_i(y \boldsymbol x) \leq 0 \: \forall \: i = 1, \dots, C \text{ for } y \text{ given}. The important thing is that \boldsymbol x satisfies the constraints; the actual objective is not that important.
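For fixed y the constraints become plain quadratic in \boldsymbol x, so (dropping the objective, since maximizing a norm is itself nonconvex) a feasibility version would look roughly like this (again made-up placeholder data):

```julia
using JuMP, Ipopt

# Fixed-y feasibility sketch with placeholder data.
N, C = 4, 10
α = rand(C) .- 2.0
β = rand(N, N)
y = 0.5                      # fixed; to be adjusted by an outer loop

model = Model(Ipopt.Optimizer)
@variable(model, x[1:N])
# with y a constant these are quadratic, not cubic, constraints
@constraint(model, [i in 1:C],
    α[i] + sum(y * β[j, k] * x[j] * x[k] for j in 1:N, k in 1:N) <= 0)
optimize!(model)  # any feasible x is acceptable
```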
The constraints are poorly conditioned, and solving the problem in Matlab with SDPT3 has shown that, at default precision, no more than ~14 variables can be optimized at once, which is far fewer than I want to optimize.
This is why I considered Julia (besides Mathematica), since it offers (in theory) arbitrary-precision arithmetic.
How did you solve it with SDPT3? The problem isn’t convex? Did you fix y, and then solve a SDP relaxation? That would cause numerical issues.
It might help if you could provide a small set of data to test with. What is N (in practice)? What is α? What is β? Why doesn’t the summation in f depend on i? What is l?
The way it is done in Matlab with CVX/SDPT3 is to formulate the problem as \min_{\boldsymbol x} \max_i \vert g_i(\boldsymbol x) \vert, where the g_i are linear in \boldsymbol x but complex-valued. Since my desired outcome is that \max_i \vert g_i(\boldsymbol x) \vert \leq 1, I thought about reformulating this as \max_i \vert g_i(\boldsymbol x) \vert^2 \leq 1 \Leftrightarrow \vert g_i(\boldsymbol x) \vert^2 \leq 1 \: \forall i, with the f_i basically being the \vert g_i(\boldsymbol x) \vert^2.
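For concreteness (this is my assumption on the structure): writing each linear g_i as g_i(\boldsymbol x) = \boldsymbol a_i^T \boldsymbol x + b_i with \boldsymbol a_i \in \mathbb C^N, b_i \in \mathbb C and real \boldsymbol x, the squared modulus expands to \vert g_i(\boldsymbol x) \vert^2 = \left( \mathrm{Re}(\boldsymbol a_i)^T \boldsymbol x + \mathrm{Re}(b_i) \right)^2 + \left( \mathrm{Im}(\boldsymbol a_i)^T \boldsymbol x + \mathrm{Im}(b_i) \right)^2, which is a convex quadratic in \boldsymbol x.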
If it turns out you don’t need arbitrary precision after all, then you might want to try Gurobi. The latest version can handle large nonconvex quadratic problems. It may be helpful even if your problem is too large to solve to the global optimum before the end of the universe, because it provides bounds on the global optimum. This can help you evaluate local optima from Ipopt or from Gurobi’s own solve sequence.
Here’s a presentation that explains how the algorithm works:
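One caveat: the constraints as written are cubic, and Gurobi only accepts (non)convex quadratics, so you would first need an auxiliary substitution such as t_j = y x_j, which turns each y x_j x_k into the bilinear term t_j x_k. A rough sketch of the setup (hypothetical variable names):

```julia
using JuMP, Gurobi

model = Model(Gurobi.Optimizer)
# enable Gurobi's nonconvex (spatial branch-and-bound) algorithm
set_attribute(model, "NonConvex", 2)

N = 4                       # placeholder size
@variable(model, x[1:N])
@variable(model, y >= 0)
@variable(model, t[1:N])
# bilinear defining constraints: t_j = y * x_j, so y x_j x_k == t_j x_k
@constraint(model, [j in 1:N], t[j] == y * x[j])
```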
To give some more detail on how it was solved: the full problem is \max_{\boldsymbol x} y \text{ such that } \max_i \vert g_i(y \boldsymbol x) \vert \leq 1
where y \in \mathbb R^+, \boldsymbol x \in \mathbb R^N, g_i: \mathbb R^N \to \mathbb C, and the g_i are linear in \boldsymbol x.
For fixed y, this is a convex problem in \boldsymbol x which I solve with CVX.
To optimize y, I apply bisection in an outer loop: since the g_i are essentially complex-valued polynomials with coefficients \boldsymbol x, \max_i \vert g_i(y \boldsymbol x) \vert is expected to decrease as y decreases. Thus, a bisection method is applied to determine y. Details can be found in this paper.
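The outer loop is plain bisection, roughly like this (a sketch; solve_fixed_y is a hypothetical routine that solves the convex fixed-y problem and returns the attained \max_i \vert g_i(y \boldsymbol x^\star) \vert):

```julia
# Bisection on y. Assumes the constraint is satisfiable at y = lo,
# violated at y = hi, and that max_i |g_i(y x*)| decreases with y.
function bisect_y(solve_fixed_y; lo = 0.0, hi = 1.0, tol = 1e-8)
    while hi - lo > tol
        mid = (lo + hi) / 2
        if solve_fixed_y(mid) <= 1.0
            lo = mid   # feasible: try a larger y
        else
            hi = mid   # infeasible: shrink y
        end
    end
    return lo          # largest verified-feasible y
end
```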
Recall that the g_i(\boldsymbol x) map to the complex numbers \mathbb C and thus \vert g_i(\boldsymbol x)\vert \leq 1 is not equivalent to -1 \leq g_i(\boldsymbol x) \leq 1.
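If you stay in rectangular rather than polar form, the usual trick (a sketch, not necessarily what your paper does) is to split g_i into real and imaginary parts and impose \vert g_i(\boldsymbol x) \vert \leq 1 as a second-order cone constraint, e.g. in JuMP:

```julia
using JuMP, Clarabel

# Made-up linear g_i(x) = a[i, :]' * x + b[i] with complex coefficients.
N, C = 4, 10
a = randn(ComplexF64, C, N)   # placeholder data
b = randn(ComplexF64, C)      # placeholder data

model = Model(Clarabel.Optimizer)
@variable(model, x[1:N])
for i in 1:C
    re_i = real.(a[i, :])' * x + real(b[i])   # Re g_i(x)
    im_i = imag.(a[i, :])' * x + imag(b[i])   # Im g_i(x)
    # [t; u] in SecondOrderCone() encodes ||u||₂ ≤ t, here √(re² + im²) ≤ 1
    @constraint(model, [1.0, re_i, im_i] in SecondOrderCone())
end
```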
I could try to formulate this condition using the polar form of complex numbers; maybe I will look into that.