JuMP: many small problems or one combined problem?

I am solving many independent assignment/matching problems using JuMP. Since the problems are independent, it makes no theoretical difference whether I solve each of them separately or solve one big combined problem. However, I am wondering about the computational efficiency of each approach. Solving many small problems incurs lots of model building and setup cost, but solve times for larger problems tend to grow superlinearly once the problem size passes some threshold.

Here is an example of one of these small problems:

using JuMP, HiGHS

qtyowned = [400, 400, 100, 100]
val = [10, 10, 10, 10.50]
qtysold = [200, 200, 200, 200, 100, 100]
saleprc = [10, 10, 10, 10, 10, 10.50]

N = length(qtysold)
K = length(qtyowned)

model = Model(HiGHS.Optimizer)
@variable(model, x[i=1:N, k=1:K], Bin)
@constraint(model, [i=1:N], sum(x[i,:]) == 1)
@constraint(model, [k=1:K], sum(x[j,k] * qtysold[j] for j in 1:N) == qtyowned[k])
@constraint(model, [k=1:K], sum(x[j,k] * qtysold[j] * saleprc[j] for j in 1:N) == qtyowned[k] * val[k])
optimize!(model)
round.(Int, value.(x))

NB: sometimes I add a quadratic objective for aligning timestamps as well, if that matters at all.

I can stack a bunch of these problems together by either making x really big and adding another eligibility constraint or by looping over this and creating lots of subproblems in one model. Usually the dimension is small but sometimes N and K can be as large as 50 or 100 for a single problem. Would it be faster to keep them separate or combine them?

Also, separately, the first constraint is equivalent to SOS1(), correct? But I haven’t found a solver that supports that constraint. Would that be faster?

You could build one model once and then modify its coefficients between solves. If all your problems have the same size, this would definitely be less memory-hungry, even in the scenario where the solver recognizes the independence of the subproblems. On the other hand, it would prevent parallelizing across problems.
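A minimal sketch of that build-once, modify-in-place idea, using JuMP's `set_normalized_coefficient` and `set_normalized_rhs`. The fixed sizes and the `resolve!` name are my own assumptions, not from the original post; the constraints start with placeholder coefficients that each call overwrites:

```julia
using JuMP, HiGHS

# Build the model once for fixed (hypothetical) dimensions N, K.
N, K = 6, 4
model = Model(HiGHS.Optimizer)
set_silent(model)
@variable(model, x[1:N, 1:K], Bin)
@constraint(model, assign[i = 1:N], sum(x[i, :]) == 1)
# Placeholder coefficients/RHS; overwritten before every solve.
@constraint(model, qty[k = 1:K], sum(x[j, k] for j in 1:N) == 0)
@constraint(model, prc[k = 1:K], sum(x[j, k] for j in 1:N) == 0)

# For each new data set, rewrite coefficients and right-hand sides in place.
function resolve!(model, x, qty, prc, qtyowned, val, qtysold, saleprc)
    N, K = size(x)
    for k in 1:K, j in 1:N
        set_normalized_coefficient(qty[k], x[j, k], qtysold[j])
        set_normalized_coefficient(prc[k], x[j, k], qtysold[j] * saleprc[j])
    end
    for k in 1:K
        set_normalized_rhs(qty[k], qtyowned[k])
        set_normalized_rhs(prc[k], qtyowned[k] * val[k])
    end
    optimize!(model)
    return round.(Int, value.(x))
end
```

Re-solving this way skips rebuilding the JuMP model object each time, at the cost of serializing the solves.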

They are not all the same size. Most are probably about the size of the MWE, but some can be an order of magnitude larger. Right now I use a sort of hybrid, batched approach, where I combine some of the problems together (say, 500 individual problems become 16 larger problems) and then solve each batch on a single thread.

I am not sure if I understand your question correctly, but I would expect that solving the problems separately is faster (including time for building the models). Assuming that you have to solve k problems with n binary variables:

  • The worst-case size of the branch-and-bound tree for solving one problem is 2^n

  • If you solve them separately you would have k * 2^n

  • If you combine the problems into one large one, you would have 2^(kn)
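To put hypothetical numbers on those worst-case counts (note that 2^(kn) = (2^n)^k, so combining raises the per-problem bound to the k-th power rather than multiplying it by k):

```julia
# Hypothetical sizes: k = 10 subproblems with n = 20 binaries each.
k, n = 10, 20
separate = k * big(2)^n    # k * 2^n   ≈ 1.0e7 nodes, worst case
combined = big(2)^(k * n)  # 2^(kn)    ≈ 1.6e60 nodes, worst case
```

These are worst-case tree sizes only; in practice presolve and branching heuristics do far better, but the asymmetry between the two bounds is real.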

I mean that the data {qtyowned, val, qtysold, saleprc} changes, but the JuMP model always has the same structure as the MWE.

If I have P such problems (assume for now they are all the same size, N by K), I can combine them in two ways. One way is to make x[i=1:N*P, k=1:K*P] and add a masking constraint x .<= eligible, where eligible is a Boolean array of the same size as x that encodes the grouping structure. If there are lots of groups, this becomes a big, mostly-empty problem; I found this to be a bad option.

The second way is to loop over what are essentially the @variable and @constraint lines of the MWE, attaching everything to the same model, except this requires some clever work to be able to reference the variables afterwards.
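One way to handle that variable-referencing issue is JuMP's anonymous variable containers, which return the container so you can store one per subproblem. A sketch, assuming the per-problem data lives in a `Problem` struct of my own invention:

```julia
using JuMP, HiGHS

# Hypothetical container for one subproblem's data.
struct Problem
    qtyowned::Vector{Float64}
    val::Vector{Float64}
    qtysold::Vector{Float64}
    saleprc::Vector{Float64}
end

function build_combined(problems::Vector{Problem})
    model = Model(HiGHS.Optimizer)
    set_silent(model)
    # Anonymous @variable returns the container, so keep one x per problem.
    xs = Vector{Matrix{VariableRef}}(undef, length(problems))
    for (p, pr) in enumerate(problems)
        N, K = length(pr.qtysold), length(pr.qtyowned)
        x = @variable(model, [1:N, 1:K], Bin)
        @constraint(model, [i = 1:N], sum(x[i, :]) == 1)
        @constraint(model, [k = 1:K],
            sum(x[j, k] * pr.qtysold[j] for j in 1:N) == pr.qtyowned[k])
        @constraint(model, [k = 1:K],
            sum(x[j, k] * pr.qtysold[j] * pr.saleprc[j] for j in 1:N) ==
            pr.qtyowned[k] * pr.val[k])
        xs[p] = x
    end
    return model, xs
end
```

Because the variables are anonymous, subproblems of different sizes coexist in one model without name clashes, and `xs[p]` recovers problem p's solution after the solve.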

Anyways, I think that what you said is correct. I left the model code quite simple, but put some effort into parallelizing efficiently across the many problems (outside of JuMP), and it seems to be faster than before.
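For reference, per-problem parallelization of this kind can be done with Julia's threading, since each task builds and owns its own model (a sketch; `solve_one` and `datasets` are my own hypothetical names, and this assumes the solver is safe to run in separate instances per thread, which holds for HiGHS):

```julia
using JuMP, HiGHS

# Hypothetical solve of one subproblem; mirrors the MWE from the question.
function solve_one(qtyowned, val, qtysold, saleprc)
    N, K = length(qtysold), length(qtyowned)
    model = Model(HiGHS.Optimizer)
    set_silent(model)
    @variable(model, x[1:N, 1:K], Bin)
    @constraint(model, [i = 1:N], sum(x[i, :]) == 1)
    @constraint(model, [k = 1:K],
        sum(x[j, k] * qtysold[j] for j in 1:N) == qtyowned[k])
    @constraint(model, [k = 1:K],
        sum(x[j, k] * qtysold[j] * saleprc[j] for j in 1:N) ==
        qtyowned[k] * val[k])
    optimize!(model)
    return round.(Int, value.(x))
end

# Hypothetical workload: four copies of the MWE data.
qtyowned = [400, 400, 100, 100]
val = [10, 10, 10, 10.50]
qtysold = [200, 200, 200, 200, 100, 100]
saleprc = [10, 10, 10, 10, 10, 10.50]
datasets = [(qtyowned, val, qtysold, saleprc) for _ in 1:4]

# Each task owns its model, so no locking is needed.
results = Vector{Matrix{Int}}(undef, length(datasets))
Threads.@threads for p in eachindex(datasets)
    results[p] = solve_one(datasets[p]...)
end
```

Run with `julia --threads=auto` (or set `JULIA_NUM_THREADS`) so the loop actually fans out across cores.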
