1e6+ constrained optimization problems

I am working on a problem in medical imaging where I have to solve an optimization problem at every voxel (read: pixel) that is not background. The optimization problem is relatively simple: it is a system of three non-linear (essentially polynomial) equations in three unknowns, where the values are bounded within certain (fairly large) ranges. The problem is that there are over one million voxels that need to be solved for.

Previously I was using Python's SciPy implementation of least squares to solve the problem at every voxel; however, even with parallelization this takes hours. I was hoping to speed things up by switching to Julia, where there isn't so much overhead from for loops.

I looked in two directions: JuMP and NLsolve. With JuMP it appears to be somewhat difficult to update the constraints needed to solve the system of equations; it looks like I would use the fix function to change the constant values (corresponding to voxel intensities) in the constraints. It would appear that there is considerable overhead in doing this, as well as in the call to the external solver. I may be completely wrong, though.
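For illustration, here is a rough sketch of the build-once / update-in-a-loop pattern I was imagining, using @NLparameter and set_value rather than fix; the Ipopt solver, the bounds, and the stand-in equations below are all placeholders, not my actual model:

```julia
# Rough sketch only: one JuMP model, with the per-voxel data held in NL
# parameters that are updated inside the loop. Equations, bounds, and the
# choice of Ipopt are placeholders, not the real imaging model.
using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, 0.0 <= x[1:3] <= 10.0)   # the three unknowns, with placeholder bounds
@NLparameter(model, d[i = 1:3] == 0.0)    # per-voxel intensities
@NLconstraint(model, [i = 1:3], x[i] + x[1] * x[2] * x[3] == d[i])  # stand-in equations
@objective(model, Min, 0)                 # pure feasibility problem

voxel_data = [(1.0, 2.0, 3.0), (1.5, 2.5, 3.5)]  # hypothetical per-voxel data
for v in voxel_data
    for i in 1:3
        set_value(d[i], v[i])
    end
    optimize!(model)
    # collect value.(x) here
end
```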

The other direction was NLsolve, but NLsolve's mcpsolve does not enforce constraints (see here). It is really fast, but the values stray outside the bounds in my equations, unfortunately resulting in nonsense.

Does anyone have any idea about a good way to formulate the problem such that there is a reasonably computationally efficient solution? Or is this all totally misguided?

Thanks for any help!

I think you might be able to use Parametron to efficiently solve many instances of the same problem (but with different data) in a loop.

Can you tell me a little bit more about your optimization problem? Maybe the problem can be reformulated so that you can use HomotopyContinuation.jl. Happy to help if you are interested.

It’s not super clear to me what the exact problem is. Is it a non-linear least squares problem, or a system of non-linear equations?

Thank you all very much for the responses and recommendations. I will look into all of the recommended packages, but I'll also give a little bit of background about what I am trying to do, since it appears that will be helpful (and because I don't really know what I'm doing insofar as optimization goes :sweat_smile:).

So I have three magnetic resonance (MR) images: X, Y, Z. They come from three different pulse sequences which interact with NMR relaxation parameters in known ways. That is, we have equations which, given the underlying NMR parameters (specifically T1, T2, and PD) and the scanner parameters p^X, p^Y, p^Z corresponding to each image X, Y, Z, model the intensity value observed in the acquired image at some voxel [(i, j, k), e.g., X_{i,j,k}, since we are in 3D].

However, we don't know the T1, T2, and PD parameters, but we do have X, Y, Z, and we can solve for each p^{(\cdot)} separately. If you haven't followed any of the preceding (mediocre) explanation, the problem simply reduces to the following system of equations, which I am trying to solve for T1, T2, and PD, where everything else is given.

$\log(X_{i,j,k}) = \log(PD) + p_0^X + p_1^X\, T1 + p_2^X\, T1^2$
$\log(Y_{i,j,k}) = \log(PD) + p_0^Y + p_1^Y\, T1 + p_2^Y / T2$
$\log(Z_{i,j,k}) = \log(PD) + p_0^Z + p_1^Z\, T1 + p_2^Z / T2$

(Apologies for the formatting.) In a previous iteration of this project, the person who built the code used non-linear least squares (in MATLAB, FWIW) to solve for the values of T1, T2, and PD. However, that person has since left our lab, and with them went the knowledge of why that approach was chosen. I do know they tried several solvers before settling on nonlinear least squares, although the equations were much more involved when the system was first developed.

I implemented the above in Python here, and it works on the one data set I tried it on. However, as previously stated, it takes several hours, and I want to run this on hundreds of data sets if possible.
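Since the plan is to port this to Julia, here is a minimal sketch of what the per-voxel residual function looks like there; the scanner parameters and intensities below are made-up placeholders, not real values:

```julia
# Minimal sketch of the per-voxel system as a residual function in Julia.
# The p coefficients and the observed intensities are placeholders.
pX = (0.1, 0.2, 0.3); pY = (0.4, 0.5, 0.6); pZ = (0.7, 0.8, 0.9)
X, Y, Z = 100.0, 80.0, 60.0   # observed intensities at one voxel (i, j, k)

# unknowns: u = (T1, T2, PD); a solver drives these three residuals to zero
function residuals(u)
    T1, T2, PD = u
    return [
        log(PD) + pX[1] + pX[2] * T1 + pX[3] * T1^2 - log(X),
        log(PD) + pY[1] + pY[2] * T1 + pY[3] / T2   - log(Y),
        log(PD) + pZ[1] + pZ[2] * T1 + pZ[3] / T2   - log(Z),
    ]
end
```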

Hopefully this helps in understanding where I am coming from. I am writing this quickly as I am traveling today.

Thanks for all the help!

Something seems off with the equations: the second and third equations imply Y_{i,j,k} = Z_{i,j,k} (if I'm reading the p_2 terms the same way in both). Or am I missing something?

Parametron can only handle up to MIQP at this point, and there are currently no plans to handle nonlinear programs (I’m the main author).

FYI, you can just use LaTeX math on Discourse using dollar signs, e.g. $X_{i, j, k}$ becomes X_{i, j, k}.

I forgot to add the proper notation. I have reformatted the equations using the LaTeX commands (thanks for pointing that out @tkoolen). Let me know if something is still ambiguous. Thanks!

I made a package, StaticOptim (GitHub - aaowens/StaticOptim.jl), for the purpose of solving millions of cheap non-linear optimization problems quickly, but it doesn't support constraints. Maybe you could take care of the bound constraints manually? If you solve the unconstrained problem and the solution is out of bounds, you can re-solve the problem with that constraint binding (and check the KKT conditions).
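For concreteness, here is a rough sketch of that pattern; Optim.jl and the toy two-variable objective are stand-ins, not StaticOptim itself:

```julia
# Solve unconstrained first; only if the minimizer leaves the box, re-solve
# with the violated bound held active and check the KKT (gradient-sign)
# condition there. Optim.jl and the toy objective are placeholders.
using Optim, ForwardDiff

f(x) = (x[1] - 2.0)^2 + (x[2] + 1.0)^2   # toy sum-of-squares objective
lower, upper = [0.0, 0.0], [5.0, 5.0]

x = Optim.minimizer(optimize(f, [1.0, 1.0], BFGS()))

if !all(lower .<= x .<= upper)
    # here x[2] violates its lower bound: fix it there and minimize over x[1] only
    g(y) = f([y[1], lower[2]])
    x1 = Optim.minimizer(optimize(g, [x[1]], BFGS()))[1]
    x = [x1, lower[2]]
    # KKT check: at an active lower bound the partial derivative must be >= 0
    @assert ForwardDiff.gradient(f, x)[2] >= 0
end
```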

Maybe I should add that functionality; it doesn't sound very hard.

I created a Jupyter notebook which shows you how to solve this with HomotopyContinuation.jl:

Hope this helps! Let me know if it works for you. Also, it would be cool if you could point me to some real-world data :slight_smile:
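For readers skimming the thread, here is a minimal sketch of what such a formulation might look like (this is not the notebook; it uses the @var syntax of recent HomotopyContinuation.jl versions, and the scanner parameters and intensities are made up):

```julia
# Sketch only: after multiplying the last two equations by T2, the system is
# polynomial in (T1, T2, logPD), with PD = exp(logPD). All numbers are made up.
using HomotopyContinuation

@var T1 T2 logPD
pX = [0.1, 0.2, 0.3]; pY = [0.4, 0.5, 0.6]; pZ = [0.7, 0.8, 0.9]
logX, logY, logZ = log(100.0), log(80.0), log(60.0)

F = [
    logPD + pX[1] + pX[2] * T1 + pX[3] * T1^2 - logX,
    (logPD + pY[1] + pY[2] * T1 - logY) * T2 + pY[3],
    (logPD + pZ[1] + pZ[2] * T1 - logZ) * T2 + pZ[3],
]

result = solve(F)
real_solutions(result)   # then keep only the solutions inside the physical bounds
```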

Okay, so the constraints are just bounds. LsqFit supports this. Are the equations actually a system of polynomials, or is this a simplified version of the problem?
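For reference, a bound-constrained fit with LsqFit might look roughly like the sketch below; the coefficients, intensities, initial guess, and bounds are all placeholders:

```julia
# Rough sketch of a bound-constrained LsqFit call for one voxel.
# p = [T1, T2, PD]; xdata is a dummy and the "model" just returns the three
# predicted log-intensities. All numbers are placeholders.
using LsqFit

pX = (0.1, 0.2, 0.3); pY = (0.4, 0.5, 0.6); pZ = (0.7, 0.8, 0.9)
ydata = log.([100.0, 80.0, 60.0])   # log(X), log(Y), log(Z) at one voxel

model(x, p) = [
    log(p[3]) + pX[1] + pX[2] * p[1] + pX[3] * p[1]^2,
    log(p[3]) + pY[1] + pY[2] * p[1] + pY[3] / p[2],
    log(p[3]) + pZ[1] + pZ[2] * p[1] + pZ[3] / p[2],
]

fit = curve_fit(model, 1:3, ydata, [1.0, 0.1, 50.0];
                lower = [0.0, 1e-3, 1e-3], upper = [5.0, 2.0, 1e4])
fit.param   # estimated [T1, T2, PD]
```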

Wow, great work @saschatimme! I am sure this will help. I won't be able to look at this much today, but I'll be able to do some more exploration with it tomorrow. I will also check out the StaticOptim package, @aaowens, and report back my findings on both fronts. @pkofod I'll definitely use LsqFit, at least as a baseline, so that I can compare the output to the Python version. The reason I didn't initially use LsqFit was that it uses the Levenberg-Marquardt method, and SciPy's version of that method does not allow bounds, so I mistakenly assumed that bounds would not be supported in LsqFit either, or that there would be an issue with them. Thanks for pointing that out!

Regarding the data: unfortunately, since this is medical data from real patients, it is infeasible for me to share it. However, there are a couple of sources of publicly available data that would be of immediate use with this application; here are the links:

It will probably not be clear how to use these data sets with this application, but I'd definitely be willing to help anyone trying to do something with them. FWIW, I list some more data sets here, along with methods to preprocess them for analysis (you can disregard the final part concerning deep learning applications; everything else is still relevant).

Thanks again for all the information. This was incredibly useful :+1:

@pkofod These are heavily simplified equations that only roughly approximate the true relationship. However, the model is useful, so this is an OK starting point.

Sure, but if the real model is not three polynomials, that matters for which solution you invest your time in implementing. It may be that a polynomial solver (as above) works great now, but then you suddenly add a +log(parameter) term and all that code has to be rewritten in another system. You still have that baseline, of course, and you can use it to compare against other nonlinear solvers, but it won't generalize to “uglier” problems. Just a thing to keep in mind!

Hi all,

let me add some lines to the discussion:

In my opinion, if you want to solve a set of equations, and you can formulate them as polynomial equations, you should ALWAYS solve the equations and not use optimization algorithms. Even if you have log() or exp() terms, you can often add artificial variables to obtain polynomial equations (for instance, if you have sin(x) and cos(x), add variables c and s with the additional equation c^2 + s^2 = 1).

Let me explain this, because it goes a bit contrary to common knowledge: say the system is $f(x) = 0$. The corresponding optimization problem is $\min \|f(x)\|^2$ plus constraints, which gives the gradient equation $J_f(x)^\top f(x) = 0$. This equation usually has many more solutions than $f(x) = 0$ (namely, all the local extrema), so finding the actual minimum requires a lot more work. (For example, for $f(x) = x^2 - 1$ the gradient equation $2x(x^2 - 1) = 0$ picks up the spurious critical point $x = 0$.) On the other hand, solving $f(x) = 0$ directly, even without constraints, is much cheaper, and one can check the constraints a posteriori.

Totally agree! If you can transform your problem into another problem with robust solution methods, you should. However, it's still not clear to me what the OP actually wants to solve once he turns to his “advanced model”. Ultimately, there are only so many tricks and transformations you can do, and you may have to fall back on local, general-purpose nonlinear solvers in the end.

The reason I took the bait on least-squares solving was that it was used in the reference code, so I thought it might be a non-linear least-squares problem they actually wanted to solve. But it appears the previous choice of SciPy's MINPACK wrapper was made simply because you can, in principle, solve the system of equations by minimizing a sum of squares (and hoping that you are actually able to find a zero-residual solution).

I added bound-constrained optimization to StaticOptim master using projected Newton. It's still experimental, but can you try it from the master branch? The syntax is

using StaticOptim
using StaticArrays  # sx, l, and u must be SVectors from StaticArrays
res = constrained_soptimize(f, sx, lower = l, upper = u)

and there should be minimal overhead. It uses ForwardDiff, so you'll need to make sure your function is compatible with abstract number types, but I doubt your functions will have any problem.
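For example, a hypothetical end-to-end call for this problem might look like the following, minimizing the sum of squared residuals of the three equations at one voxel (all coefficients, intensities, the initial guess, and the bounds are placeholders):

```julia
# Hypothetical usage for one voxel; x = (T1, T2, PD). All numbers are placeholders.
using StaticOptim, StaticArrays

pX = (0.1, 0.2, 0.3); pY = (0.4, 0.5, 0.6); pZ = (0.7, 0.8, 0.9)
logX, logY, logZ = log(100.0), log(80.0), log(60.0)

# written generically (no concrete type annotations) so ForwardDiff can
# differentiate through it with dual numbers
f(x) = (log(x[3]) + pX[1] + pX[2] * x[1] + pX[3] * x[1]^2 - logX)^2 +
       (log(x[3]) + pY[1] + pY[2] * x[1] + pY[3] / x[2]   - logY)^2 +
       (log(x[3]) + pZ[1] + pZ[2] * x[1] + pZ[3] / x[2]   - logZ)^2

sx = SVector(1.0, 0.1, 50.0)    # initial guess
l  = SVector(0.0, 1e-3, 1e-3)   # lower bounds
u  = SVector(5.0, 2.0, 1e4)     # upper bounds
res = constrained_soptimize(f, sx, lower = l, upper = u)
```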

Hi, maybe an unpopular opinion here: did you try using Numba on the minimizer functions F, dF, and H? I get speedups of about 15× in all of them. This is just an easy-to-implement intermediate solution, since it seems that only Julia will be much faster, but in the meantime it may help a bit and you can still maintain your Python code.