[Wall of text incoming, beware!]
I’m a Julia novice and have been playing around with JuMP over the last month or so. All my small-scale test models worked beautifully, so I finally decided to dive into the deep end of the pool and try a full-scale model.
So I spent last week translating a global energy system model from GAMS to JuMP. The model has roughly 1500 lines of code and reads input data from three external spreadsheets. It generates a pure LP problem with 7-30 million equations (depending on parameterization) and roughly the same number of variables. But it’s just an LP, so CPLEX solves it easily in about 30 seconds on a standard quad-core desktop. That’s the smallest version of the model; larger versions may need up to a few hours.
The JuMP version works, that’s the good news. However, it’s disappointingly slow: GAMS generates the model and sends it to CPLEX in 10 seconds, but my JuMP implementation needs 220 seconds. Almost all this time is spent generating the model constraints. JuMP also eats about twice the memory (but my computer has plenty of RAM so there’s no slowdown due to memory swapping). Interestingly, model generation time doesn’t scale much with model size: a larger version of my model with twice the rows and twice the columns generates in 260 seconds.
I’ve seen a few benchmarks that suggest that JuMP should be roughly as fast as GAMS, so I suppose my implementation is to blame for these performance issues. I hope I can get some pointers on how to speed things up. Here’s a summary of what I’ve done.
Most “sets” of the model (using GAMS terminology) are vectors of symbols. The only exception is the time set, which is a vector of integers. Examples:
primary = [:bio1, :bio2, :hydro, :wind, :solar, :gas1, :gas2, :oil1, :oil2, :coal1, :coal2, :uranium1, :uranium2]
sector = [:elec, :central_heat, :dist_heat, :solid_heat, :non_solid_heat, :feedstock, :transport]
region = [:EUR, :NAM, :PAO, :FSU, :LAM, :CPA, :AFR, :SAS, :PAS, :MEA]
time = collect(2010:10:2150)
Model “parameters” are NamedArrays of Float64, indexed by the sets above. I understand that I am giving up some performance by not using ordinary arrays, but it is important for readability to be able to write demand[:elec,:EUR,t] instead of demand[indexof(sector,:elec), indexof(region,:EUR), t]. The latter gets extremely messy when you have lots of that stuff in the same equation. Also, modeling languages like GAMS and AMPL provide indexing by name, so the comparison is fair.
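To make the readability trade-off concrete, here is a minimal sketch in plain Julia of the positional lookup that name-based indexing hides. The indexof helper, the lookup dictionaries, and the demand values are hypothetical stand-ins for illustration, not the model's actual code or data, and the time index is left out to keep it short.

```julia
sector = [:elec, :central_heat, :dist_heat]
region = [:EUR, :NAM, :PAO]

# Hypothetical helper: position of a name within a set (linear search).
indexof(set, name) = findfirst(==(name), set)

# Made-up demand data: one row per sector, one column per region.
demand = [1.0 2.0 3.0;
          4.0 5.0 6.0;
          7.0 8.0 9.0]

# Positional access through the helper, i.e. the "messy" form.
d1 = demand[indexof(sector, :central_heat), indexof(region, :PAO)]

# Precomputed lookup tables avoid repeating the search on every access.
sector_pos = Dict(s => k for (k, s) in enumerate(sector))
region_pos = Dict(r => k for (k, r) in enumerate(region))
d2 = demand[sector_pos[:central_heat], region_pos[:PAO]]
```

Both forms read the same element; the difference is only in how the name is translated to a position and how much of that translation shows up in each equation.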
And here are a few examples of model constraints. Convention: model variables begin with a capital letter and parameters begin with a lower case letter.
@constraint(m, supply_c[i in energy_in, r in region, t in time],
Supply[i,r,t] - En_export[i,r,t] + En_import[i,r,t] == sum(En_conv[i,o,r,t] for o in energy_out)
)
@constraint(m, capacity_c[i in energy_in, o in energy_out, r in region, t in time; o != :elec],
En_conv[i,o,r,t] * effic[i,o,r,t] <= Capacity[i,o,r,t] * cf[i,o] * 8760*3600/1e6
)
@constraint(m, market_growth_c[i in energy_in, o in energy_out, r in region, t in time; t > t0],
Cap_invest[i,o,r,t] <= Cap_invest[i,o,r,t-t_step] * (1+maxgrowth)^t_step + seed_capacity
)
@constraint(m, cost_capacity_c[r in region, t in time],
Cost_capacity[r,t] == sum(Cap_invest[i,o,r,t] * invest_cost[i,o,t] for i in energy_in, o in energy_out)
)
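The market growth constraint uses t0 and t_step, whose definitions are not shown in the excerpts above. The sketch below assumes t0 is the first model year and t_step is the spacing of the time set, and checks that every t > t0 has a predecessor t - t_step inside the set. The function name growth_index_check is made up for this illustration.

```julia
# Wrapped in a function so the local name `time` cannot clash with Base.time.
function growth_index_check()
    time = collect(2010:10:2150)

    # Assumed definitions, not taken from the actual model code.
    t0 = first(time)              # first model year
    t_step = time[2] - time[1]    # spacing between model years

    # Years for which the market growth constraint is generated (t > t0).
    growth_years = [t for t in time if t > t0]

    # Every such year must have its predecessor t - t_step in the time set,
    # otherwise Cap_invest[i,o,r,t-t_step] would refer to a missing index.
    all_valid = all(t -> (t - t_step) in time, growth_years)

    return (t0 = t0, t_step = t_step,
            n_years = length(growth_years), all_valid = all_valid)
end

result = growth_index_check()
```

Under these assumptions the filter t > t0 drops exactly one year, so the lagged index never falls outside the time set.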
Incidentally, I would much prefer to use the @constraints m begin ... end syntax, since the list of constraints would become much cleaner without the repeated @constraint macros and additional parentheses. However, when debugging the model, all errors were reported with the line number of the @constraints statement. I had to switch to the uglier syntax to get the correct line number of the error.
I initially had the entire model in the global scope, split across a number of smaller files using include() statements. I gained a bit of speed when I put everything in the same file and wrapped it all in a function. However, that means I now have an unmanageable file with 1500+ lines. Is there a good way to split it up again but still have everything in the same non-global scope? Using include() doesn’t work inside a function.
Thanks in advance for any help!