Matrix assignment problem: performance advice?

I have an assignment problem with the following aims:

  • there are nstudents students, each of whom belongs to one of several different graduate programs; each program has multiple students in it. (For a sense of scale, in our case nstudents = 120.)
  • there are nweeks weeks of collaborative group work; in each week there are several different groups
  • in each week, a student gets assigned to a single group (in our case, there are 2 or 3 weeks and 15-20 groups per week)
  • students assign a preference score to each group in each week
  • (same program penalty) we want to penalize (but not block entirely) cases where two students from the same program end up in the same group
  • (repeated pairing penalty) we want to penalize (but not block entirely) cases where two students are paired in a group for more than one week

I’ve thought about setting up the problem as follows: let A[i, j] ∈ {0, 1} indicate whether student i is assigned to group j (j being an index that wraps both weeks and group numbers into a single index). Then define the student-student correlation C = A*A', and let S[i1, i2] ∈ {0, 1} be 1 when students i1 and i2 are in the same program. If ⊙ denotes the Frobenius inner product, the objective might be written

P⊙A + αS⊙C + βC⊙C

where P is the preference matrix and α and β are scalar coefficients that express the magnitude of the same-program and repeated-pairing penalties, respectively. The reasoning behind the C⊙C term is that it’s small when the correlation is 0 or 1, and grows rapidly for correlations of 2 or more. In code, this can be expressed as

    @NLobjective(model, Min, sum(A[i, j] * P[i, j] for i = 1:nstudents, j = 1:totaloptions) +
                             penalty_sameprogram * sum(A[i1, j] * A[i2, j] * S[i1, i2] for i1 = 1:nstudents, i2 = i1+1:nstudents, j = 1:totaloptions) +
                             penalty_samepartner * sum(sum(A[i1, j] * A[i2, j] for j = 1:totaloptions)^2 for i1 = 1:nstudents, i2 = i1+1:nstudents))

where smaller P is better (we use 1 = most-preferred, 4 = least-preferred) and both penalty coefficients are nonnegative.

I’m looking for general guidance about how to tackle this. I’ve been trying Juniper + Ipopt, but I’ve been dismayed by the performance. Even on a simple test case like this:

julia> students
6-element Vector{Student}:
 StudentA Last (Program1)
 StudentB Last (Program2)
 StudentC Last (Program3)
 StudentD Last (Program1)
 StudentE Last (Program2)
 StudentF Last (Program3)

    preferences = (
        [1 4;
         1 4;
         1 4;
         4 1;
         4 1;
         4 1],
        [1 3 4;
         1 2 4;
         1 4 3;
         1 3 4;
         1 2 4;
         1 4 3],
    )

I get

julia> @time assign!(students, preferences; penalty_sameprogram=5, penalty_samepartner=0.5)
  0.957838 seconds (1.00 M allocations: 116.754 MiB, 4.65% gc time)
6-element Vector{Student}:
 StudentA Last (Program1): [1, 1]
 StudentB Last (Program2): [1, 3]
 StudentC Last (Program3): [1, 1]
 StudentD Last (Program1): [2, 3]
 StudentE Last (Program2): [2, 2]
 StudentF Last (Program3): [2, 2]

That’s livable, but obviously it’s a toy problem. With 120 students and ~40 groups, there’s about a 1-minute pause before the first round of Ipopt optimization; hitting Ctrl-C reveals it’s inside acyclic_coloring. This seems to repeat each time Juniper calls Ipopt for a new solution. Only when I drop down to about 30 students does the gap between successive Ipopt runs shrink to the scale of 1 s, and since I haven’t yet waited for a run to finish, I’m unsure how many such runs there will be.

Any suggestions for tackling this?

Your description sounds very similar to Open-shop scheduling - Wikipedia, which is known to be NP-hard at your problem sizes. Your constraints aren’t quite as hard (you do allow groupings to repeat), but it’s still very similar.

For that I’d tend to use Hungarian.jl (EDIT: although I guess that’s not quite the same; the multiple-groups-per-student aspect sets this apart). I think the distinctive thing here is the penalties (similar to constraints, except they can never make the problem infeasible) on students ending up in the same group.

How about this:

In the first week, you can greedily assign students to groups according to their preferences, one program at a time, round-robin over all programs. Within a program and preference rank, pick a student at random, preferring students with a higher preference for that group. Once a group is filled, eliminate that choice from the remaining students’ preferences. This ensures an even distribution of students from any given program across the groups, which should keep same-program overlap close to minimal (assuming you can make people go to groups they really dislike, which may be more of an administrative issue than a grouping one).

It’s the second and third weeks where things get interesting, because the groupings of the previous week come into play as an additional constraint. Conceptually, you can still do the same thing, except you modify each student’s preferences, placing the previous week’s group at a random position within their preference list for the new week. That ought to work for the third week as well, though maybe there’s an approach that further biases exclusions in the third week based on the first week’s groupings too.

It’s not going to be an optimal solution, in the sense that you won’t get the minimum possible number of repeated pairings, but it should account for your same-program requirement and do reasonably well on repeated pairings. The low number of weeks you have to fill is an advantage here, because it keeps you away from the limit case.
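If it helps to make this concrete, here’s a minimal sketch of the week-1 pass (all names, the rank-style preference encoding, and the fixed-capacity policy are my assumptions, not code from this thread):

    # Hypothetical sketch of the greedy week-1 pass described above.
    # prefs[s, g]: student s's rank for group g (1 = best);
    # program_of[s]: s's program; each group holds `capacity` students.
    # Assumes capacity * ngroups >= nstudents, and uses queue order rather
    # than random tie-breaking within a program.
    function greedy_week1(prefs::Matrix{Int}, program_of::Vector, capacity::Int)
        nstudents, ngroups = size(prefs)
        assignment = zeros(Int, nstudents)
        fill_count = zeros(Int, ngroups)
        # Queue students by program so we can go round-robin over programs.
        queues = Dict{eltype(program_of),Vector{Int}}()
        for s in 1:nstudents
            push!(get!(queues, program_of[s], Int[]), s)
        end
        while any(!isempty, values(queues))
            for q in values(queues)      # one student per program per round
                isempty(q) && continue
                s = popfirst!(q)
                open_groups = [g for g in 1:ngroups if fill_count[g] < capacity]
                g = open_groups[argmin(prefs[s, open_groups])]  # best still-open group
                assignment[s] = g
                fill_count[g] += 1
            end
        end
        return assignment
    end

For weeks 2 and 3 you would, per the suggestion above, first rewrite each student’s prefs row to demote the previous week’s group before calling the same routine.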

1 Like

To my mind this is a classic application for mixed-integer programming. I would model it as a mixed-integer linear program in JuMP and solve it with HiGHS.

1 Like

I’d love that. I don’t currently see how to formulate the same-program penalty and the repeated-pairing penalty in a linear program, though. From what I can tell, the same-program penalty is quadratic, and I haven’t found a good way to implement the repeated-pairing penalty with anything less than a fourth-order term. (I tried quadratic, but it did not do what I wanted.)

A piecewise-linear outer approximation of a convex penalty function may be sufficient if you really want the behavior of a fourth-order penalty. However, my experience has been that linear or quadratic penalty functions with carefully selected coefficients are suitable across a wide range of applications.
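Here’s a hedged sketch of that outer approximation in JuMP (the names and the quartic choice are mine): because the pairing count only takes small integer values, tangent cuts to c^4 at the integers reproduce the quartic exactly at every feasible point.

    using JuMP, HiGHS

    # Sketch: outer-approximate a convex quartic penalty t >= c^4 with tangent
    # cuts; c stands for a pairing count (0..3 with up to 3 weeks).
    model = Model(HiGHS.Optimizer)
    @variable(model, 0 <= c <= 3, Int)
    @variable(model, t >= 0)
    for k in 0:3
        # Tangent to c^4 at c = k:  t >= k^4 + 4k^3 (c - k)
        @constraint(model, t >= k^4 + 4 * k^3 * (c - k))
    end
    # In the full model, t would be added to the objective with its coefficient.

Each cut is tight at its own integer point and lies below the quartic elsewhere, so the maximum of the cuts equals c^4 wherever c is integral.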

2 Likes

Since it is a binary program, the quadratic terms can be linearized exactly with McCormick envelopes (which usually provide only a relaxation):

\begin{cases} x, y, z \in \{0, 1\} \\ z = xy \end{cases} \quad \iff \quad \begin{cases} x,y,z \in \{0, 1\} \\ z \leq x \\ z \leq y \\ z \geq x + y - 1 \end{cases}

I need to think a little more about the repeated pairing penalty, but in theory you can apply the same trick iteratively.
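In JuMP this is just three linear constraints per product. A minimal sketch, assuming HiGHS as the solver (variable names are illustrative):

    using JuMP, HiGHS

    # Exact linearization of z = x*y for binary x and y.
    model = Model(HiGHS.Optimizer)
    @variable(model, x, Bin)
    @variable(model, y, Bin)
    @variable(model, 0 <= z <= 1)  # z need not be declared Bin here
    @constraint(model, z <= x)
    @constraint(model, z <= y)
    @constraint(model, z >= x + y - 1)

For binary x and y these constraints force z = xy exactly, even with z continuous.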

3 Likes

Let’s say your decision variable is x_{sgw} = 1 if student s gets assigned to group g on week w. Then your objective components would look as follows:

  • Preference scores: \sum_{s,g,w} p_{sgw} x_{sgw}
  • Same program penalty: \sum_{s_1 \sim s_2, g, w} x_{s_1 g w} x_{s_2 g w} (not sure if you count each week for this one)
  • Repeated pairing penalty: \sum_{s_1 < s_2} \left(\sum_{g, w} x_{s_1 g w} x_{s_2 g w} - 1\right)_+ (the inner sum counts the weeks in which the pair shares a group, since groups are per-week)

Does that sound about right? If so, the positive part a_+ = \max(a, 0) can also be linearized.
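For example, when the penalty enters a minimization with a nonnegative coefficient, each positive part can be replaced by an auxiliary variable (a standard trick; the name t is mine): add t \geq 0 and t \geq \sum_{g, w} x_{s_1 g w} x_{s_2 g w} - 1 and put t in the objective; at the optimum, t takes the value of the positive part (after the products themselves are linearized as above).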

1 Like

This doesn’t use the quadratic penalty, and I haven’t tested it, so there might be typos etc. But I’d model it along the lines of:

using JuMP, HiGHS

n_students = 120
n_weeks = 3
n_groups = 20
programs = ["a", "b"]
preference = rand(n_students, n_weeks, n_groups)
# Illustrative only: in a real instance, list every student's program here
program_to_students = Dict(
    "a" => [1, 2, 3],
    "b" => [4, 5, 6],
)

model = Model(HiGHS.Optimizer)
@variables(model, begin
    x[1:n_students, 1:n_weeks, 1:n_groups], Bin
    same_program_penalty[programs, 1:n_weeks, 1:n_groups] >= 0
    same_pairing[i=1:n_students, (i+1):n_students, 1:n_weeks, 1:n_groups], Bin
    repeat_pairing_penalty[i=1:n_students, (i+1):n_students] >= 0
end)
@objective(
    model, 
    Max, 
    # Maximise student preferences
    sum(preference[i, j, k] * x[i, j, k] for i in 1:n_students, j in 1:n_weeks, k in 1:n_groups) -
    # Penalties
    100 * sum(same_program_penalty) - 100 * sum(repeat_pairing_penalty),
)
@constraints(model, begin
    # Each student assigned to one group each week
    [i=1:n_students, j=1:n_weeks], sum(x[i, j, :]) == 1
    # Minimum and maximum number of students per group
    [j=1:n_weeks, k=1:n_groups], sum(x[:, j, k]) <= ceil(Int, n_students / n_groups)
    [j=1:n_weeks, k=1:n_groups], sum(x[:, j, k]) >= floor(Int, n_students / n_groups)
    # Penalize if more than one student from program p is in the same group
    [j=1:n_weeks, k=1:n_groups, p=programs],
        sum(x[i, j, k] for i in program_to_students[p]) - same_program_penalty[p, j, k] <= 1
    # Compute if two students are in the same group
    [i1=1:n_students, i2=(i1+1):n_students, j=1:n_weeks, k=1:n_groups],
        x[i1, j, k] + x[i2, j, k] <= 1 + same_pairing[i1, i2, j, k]
    # Repeat pairing penalty
    [i1=1:n_students, i2=(i1+1):n_students],
        sum(same_pairing[i1, i2, j, k] for j in 1:n_weeks, k in 1:n_groups) <= 1 + repeat_pairing_penalty[i1, i2]
end)
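To solve and pull out the assignment afterwards, something like this (standard JuMP calls, untested here):

    optimize!(model)
    assignment = value.(x) .> 0.5  # recover the 0/1 assignment from the solved model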

(This problem is actually near and dear to me. @MFairley and I wrote something similar that has been used to schedule O(600) students into groups every year for the last 10 years: http://orsnz.org.nz/conf48/program/Papers/nzsaorsnz2014_paper_48.pdf, GitHub - odow/group-allocator: Allocate people into groups with balanced characteristics using Excel)

8 Likes

I’m really blown away by the awesome suggestions, and from them I learned a ton that I imagine I’ll apply to other problems too.

I’ve put the corresponding repository up on GitHub: GitHub - timholy/AssignGroups.jl: Assign students to collaborative groups. Perhaps it will be useful in some way to others. It’s only a slight change from the suggestion by @odow. Performance is perfectly adequate for our needs now. Many, many thanks!

5 Likes