Dear all,
I could use some good help from the experts here. I was about to translate an optimization model from Python/Pyomo to Julia/JuMP, expecting that Julia/JuMP would outperform Python’s Pyomo. However, model generation in JuMP took an unexpectedly long time.
So I set up a small performance experiment comparing model generation times in Pyomo and JuMP, which I want to share with you to get advice on what I can improve.
Setting:
Windows 11
Intel Core i7-1260P CPU @ 2.1 GHz
48 GB RAM
Julia 1.8.3
JuMP 1.5.0
Python 3.8.15
Pyomo 6.4.4
I want to generate the following model with only a single constraint and a constant objective function:
$$\sum_{(i,j,k) \in \mathcal{IJK}} \sum_{(j,k,l) \in \mathcal{JKL}} \sum_{(k,l,m) \in \mathcal{KLM}} x_{i,j,k,l,m} \ge 0 \quad \forall\, i \in \mathcal{I}$$
(sorry - only allowed to post one media item)
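To make the chained index sets concrete, here is a tiny hand-made instance (illustrative only, not part of the experiment): take IJK = {(i1, j1, k1)}, JKL = {(j1, k1, l1), (j1, k1, l2)} and KLM = {(k1, l1, m1)}. Only the chain i1 → j1 → k1 → l1 → m1 appears in all three relations, so the constraint for i1 reduces to

$$x_{i1,j1,k1,l1,m1} \ge 0,$$

while (j1, k1, l2) contributes no term because there is no tuple (k1, l2, m) in KLM.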
I generated the data used in the experiment in Python as follows:
import numpy as np
import pandas as pd

def create_random_data(n, m=20):
    I = [f'i{x}' for x in range(1, n + 1)]
    J = [f'j{x}' for x in range(1, m + 1)]
    K = [f'k{x}' for x in range(1, m + 1)]
    L = [f'l{x}' for x in range(1, m + 1)]
    M = [f'm{x}' for x in range(1, m + 1)]
    # Sparse relations: each triple is included with probability 0.05
    IJK = pd.DataFrame(np.random.binomial(1, 0.05, size=(len(I) * len(J) * len(K))),
                       index=pd.MultiIndex.from_product([I, J, K], names=['i', 'j', 'k']),
                       columns=['value']).reset_index()
    JKL = pd.DataFrame(np.random.binomial(1, 0.05, size=(len(J) * len(K) * len(L))),
                       index=pd.MultiIndex.from_product([J, K, L], names=['j', 'k', 'l']),
                       columns=['value']).reset_index()
    KLM = pd.DataFrame(np.random.binomial(1, 0.05, size=(len(K) * len(L) * len(M))),
                       index=pd.MultiIndex.from_product([K, L, M], names=['k', 'l', 'm']),
                       columns=['value']).reset_index()
    # Keep only the sampled triples as lists of tuples
    ijk = [tuple(x) for x in IJK.loc[IJK['value'] == 1][['i', 'j', 'k']].to_dict('split')['data']]
    jkl = [tuple(x) for x in JKL.loc[JKL['value'] == 1][['j', 'k', 'l']].to_dict('split')['data']]
    klm = [tuple(x) for x in KLM.loc[KLM['value'] == 1][['k', 'l', 'm']].to_dict('split')['data']]
    return I, J, K, L, M, ijk, jkl, klm
The Julia model implementation:
using JuMP
using BenchmarkTools

function model_ei(I, IJK, JKL, KLM)
    model = Model()
    # Build the sparse index set of the variables by chaining the three relations
    x_list = [
        (i, j, k, l, m)
        for (i, j, k) in IJK
        for (jj, kk, l) in JKL if jj == j && kk == k
        for (kkk, ll, m) in KLM if kkk == k && ll == l
    ]
    @variable(model, x[x_list] >= 0)
    @constraint(model, [i in I], sum(
        x[(i, j, k, l, m)]
        for (ii, j, k) in IJK if ii == i
        for (jj, kk, l) in JKL if jj == j && kk == k
        for (kkk, ll, m) in KLM if kkk == k && ll == l
    ) >= 0)
    return model
end

N = [n for n in 50:50:2000]
for n in N
    I, J, K, L, M, IJK, JKL, KLM = read_data(n)
    tmi = @belapsed model_ei($I, $IJK, $JKL, $KLM) samples=7 evals=10
end
read_data(n) loads the data generated in Python above; I, J, K, L, M, IJK, JKL, KLM are all vectors (IJK, JKL, KLM are vectors of tuples).
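read_data itself is not shown here. Purely as a sketch of how the Python-generated data could reach the Julia benchmark, here is a hypothetical version that assumes each set was exported to a CSV file; the file names, column names, and the use of CSV.jl/DataFrames.jl are assumptions for illustration, not necessarily how I load the data:

using CSV, DataFrames

# Hypothetical sketch: assumes the Python script wrote each set to a CSV file,
# e.g. "I_50.csv" with a column i, "IJK_50.csv" with columns i, j, k, and so on.
function read_data(n)
    I = CSV.read("I_$(n).csv", DataFrame).i      # vector of labels
    J = CSV.read("J_$(n).csv", DataFrame).j
    K = CSV.read("K_$(n).csv", DataFrame).k
    L = CSV.read("L_$(n).csv", DataFrame).l
    M = CSV.read("M_$(n).csv", DataFrame).m
    ijk = CSV.read("IJK_$(n).csv", DataFrame)
    jkl = CSV.read("JKL_$(n).csv", DataFrame)
    klm = CSV.read("KLM_$(n).csv", DataFrame)
    IJK = collect(zip(ijk.i, ijk.j, ijk.k))      # vector of (i, j, k) tuples
    JKL = collect(zip(jkl.j, jkl.k, jkl.l))
    KLM = collect(zip(klm.k, klm.l, klm.m))
    return I, J, K, L, M, IJK, JKL, KLM
end

The exact file format does not matter for the benchmark; the only requirement is that the JuMP and Pyomo runs read the identical instance for each n.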
The Pyomo model implementation:
import pandas as pd
import numpy as np
import pyomo.environ as pyo

def ei_rule(model, i):
    return sum(model.x[i, j, k, l, m]
               for (ii, j, k) in model.IJK if ii == i
               for (jj, kk, l) in model.JKL if (jj == j) and (kk == k)
               for (kkk, ll, m) in model.KLM if (kkk == k) and (ll == l)) >= 0

def pyomo_mi(i, ijk, jkl, klm):
    model = pyo.ConcreteModel()
    model.IJK = pyo.Set(initialize=ijk)
    model.JKL = pyo.Set(initialize=jkl)
    model.KLM = pyo.Set(initialize=klm)
    model.z = pyo.Param(default=1)
    model.x = pyo.Var([(i, j, k, l, m) for (i, j, k) in ijk
                       for (jj, kk, l) in jkl if (jj == j) and (kk == k)
                       for (kkk, ll, m) in klm if (kkk == k) and (ll == l)],
                      domain=pyo.NonNegativeReals)
    model.OBJ = pyo.Objective(expr=model.z)
    model.ei = pyo.Constraint(i, rule=ei_rule)
    return model

N = range(50, 2001, 50)  # same instance sizes as in the Julia run
for n in N:
    I, J, K, L, M, IJK, JKL, KLM = read_data(n)
    tmi = %timeit -o -r 7 -n 10 pyomo_mi(I, IJK, JKL, KLM)
Here I, J, K, L, M are lists of strings and IJK, JKL, KLM are lists of tuples.
The results for increasing cardinality of IJK, with constant cardinalities for J, K, L, M, JKL, and KLM:
Can anybody explain why Julia/JuMP performs so much worse than Python/Pyomo? Or tell me what I am doing wrong?
Best
Justine