I’ve been struggling for a couple of days now to coerce JuMP to solve a particular problem. The goal is fairly simple and I’ll describe a simplified version of my real problem here, along with the code:

Given the latitudes, longitudes, and populations of 10 cities, I need to divide the cities into 3 groups in such a way that the geographical distance between all cities in a group is minimized, and I also need the total populations of each of the three groups to be as close as possible.

I’ve tried this with GLPK, Ipopt, and Juniper and I can’t get any of them to work so I’m hoping someone with more knowledge/experience than me can tell me which solver is actually capable of handling this problem.

```
using DataFrames
using Distances
using GLPK
using Ipopt
using JuMP
using Juniper
using LinearAlgebra
using Random
using Statistics
# DataFrame with 10 cities including their lat/lon and population
cities = DataFrame(
city=[
"Marysville",
"Perris",
"Cleveland",
"Worcester",
"Columbia",
"Waterbury",
"Eagan",
"Southfield",
"Lafayette",
"Boise"
],
lat=[
48.0517637,
33.7825194,
41.49932,
42.2625932,
34.0007104,
41.5581525,
44.8041322,
42.4733688,
30.2240897,
43.6187102
],
lon=[
-122.1770818,
-117.2286478,
-81.6943605,
-71.8022934,
-81.0348144,
-73.0514965,
-93.1668858,
-83.2218731,
-92.0198427,
-116.2146068
],
pop=[
63269,
72326,
390113,
182544,
133358,
109676,
65453,
73006,
124276,
214237
]
)
num_groups = 3
# Compute the pairwise distances between each city
dm = Distances.pairwise(Haversine(6372.8), Matrix(cities[:, [2,3]])', dims=2)
# Assign random group numbers to start
groups = rand(1:num_groups, size(cities,1))
# Filter the distance matrix by group, compute the norm, then take the sum of each group's norm
ϵ = sum([norm(dm[groups .== i, groups .== i]) for i in 1:num_groups])
# Take the norm of each group's population - total population / num_groups (target pop per group)
σ = norm([sum(cities.pop[groups .== i]) for i in 1:num_groups] .- sum(cities.pop) / num_groups)
```

My approach thus far has been to create decision variables for each of the 10 cities `@variable(model, x[1:10], Int)`

and restrict their value to the integers 1, 2, and 3. The objective function is to minimize `σ + ϵ`

. The only additional constraint I have is that I must indeed end up with 3 groups but I’m not attempting to restrict the minimum number of cities per group at this point in time.

Will one of the available open-source solvers be able to handle this problem? If not, is there a better way that I can reformulate it?

I’m thinking about creating 3 vectors of length 10 and each value in the vector would be binary and then maybe there’s some way to solve the problem like that?