Hi everyone. I have a broad question rather than a question about any specific package or code block, so please let me know if it's inappropriate and I'll move/close it.
I'm interested in whether the Julia/JuMP community has published or learned any best practices for setting up large optimization models with a naturally hierarchical structure. I've searched around for examples to learn from and came up with a somewhat overwhelming list of packages that set up large JuMP models, all having to do with power infrastructure (the specific problems I'm interested in are not related to power generation/distribution, otherwise I'd just use one of these):
Are there any other packages I missed that the community would recommend as good examples of building large hierarchical problems in JuMP? Are any of the examples here especially worth studying for their design? And are there any other references I should be checking out related to setting up large problems like this? (I've read the "Design patterns for larger models" page of the JuMP manual.)
I realize I mixed in a few packages that set up stochastic problems, relying on SDDP. In general I'm interested in both approaches, although I'm mainly starting with the deterministic setting.
That list is fairly comprehensive. The NREL/SIIP PowerSystems.jl is probably the largest example, with the most engineering time put into it. They went for full modularity and full performance (sometimes at the expense of some complicated code to pre-allocate JuMP containers).
The power-focused applications are also somewhat niche, because they have some unique features that aren't in other applications (like a transmission graph, multiple time structures, and very flexible combinations of generators with different technologies all producing the same commodity).
I think the most important thing to start with is an understanding of the data. Ultimately, the JuMP optimization part will be a very small component of the overall application.
- What are the inputs and outputs at each layer of the hierarchy?
- What format should the inputs and outputs be in?
- How can you set up, test, and validate each layer of the hierarchy independently?
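As a rough, purely illustrative sketch (the layer names and fields here are hypothetical, not from any real package), one way to make those questions concrete is to give each layer a plain input struct with its own validator, so each layer can be read, checked, and tested without building any JuMP model:

```julia
# Hypothetical sketch: one plain struct per layer of the hierarchy, each with
# its own validation, so layers can be tested independently of the optimizer.
struct RegionData
    name::String
    demand::Vector{Float64}
end

struct SystemData
    regions::Vector{RegionData}
end

function validate(region::RegionData)
    all(>=(0), region.demand) || error("Region $(region.name): demand must be non-negative")
    return
end

function validate(system::SystemData)
    isempty(system.regions) && error("SystemData needs at least one region")
    foreach(validate, system.regions)
    return
end
```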
I’d also advise that unless necessary, you avoid fancy tricks. Do the simplest thing possible, even if it is slightly slower than a more clever thing.
As a cautionary tale of something that works, but which I don't think we got right (and for which I'm somewhat responsible), take a look at EPOC-NZ/JADE.jl on GitHub.
The backstory is that the Julia code was a 2016 re-implementation of a C++ application from 2011, which was itself a re-implementation of a series of AMPL scripts from 2008. And then between 2016 and now, a bunch of things got added to it in various places. (And during that time, Julia, JuMP, and SDDP.jl all changed quite a lot.) So at no point did we ever step back and design a nice representation of the data layer.
I guess my point is: ignore the JuMP part to start with. If you have a good data representation, the JuMP part will be clean and simple. If you have a nice JuMP model and messy data, the entire thing will be a mess.
With regard to the point about not being able to test each piece of JADE.jl in isolation: would you have preferred to set up the model/data input so that you could create SDDP/JuMP models for "parts" of the overall model and test them individually, with the constraints that link them to the rest of the model provided as input? For my problem, I'm really interested in being able to test in this way.
> would you have preferred to set up the model/data input so that you could create SDDP/JuMP models for "parts" of the overall model and test them individually, with the constraints that link them to the rest of the model provided as input?
I don't know if I have a good answer for exactly what that looks like yet. It's something that I'd like to address once the nonlinear rewrite is finished.
Potentially something like:
```julia
using JuMP, Test

abstract type AbstractConstraint end

struct Data
    constraint_types::Vector{AbstractConstraint}
end

function initialize_model(data)
    return Model()
end

function add_variables(model, data)
    @variable(model, x)
    return
end

struct ConstraintA <: AbstractConstraint end

function add_constraint(model, data, ::ConstraintA)
    @constraint(model, model[:x] <= 1)
    return
end

function test_ConstraintA()
    data = Data([ConstraintA()])
    model = initialize_model(data)
    add_variables(model, data)
    add_constraint(model, data, ConstraintA())
    # primal_feasibility_report returns a dictionary of violated constraints;
    # it is empty if the candidate point is feasible.
    # https://jump.dev/JuMP.jl/dev/reference/solutions/#JuMP.primal_feasibility_report
    @test isempty(primal_feasibility_report(model, Dict(model[:x] => 0.0)))
    @test isempty(primal_feasibility_report(model, Dict(model[:x] => 1.0)))
    @test !isempty(primal_feasibility_report(model, Dict(model[:x] => 1.01)))
    @test !isempty(primal_feasibility_report(model, Dict(model[:x] => 2.0)))
    return
end

function main(data)
    model = initialize_model(data)
    add_variables(model, data)
    for constraint in data.constraint_types
        add_constraint(model, data, constraint)
    end
    return model
end
```
But ideally simpler, so that you build the entire model and then pass known-feasible and known-infeasible points for each constraint.
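Something along these lines, reusing the hypothetical Data/ConstraintA/main functions from the sketch above (illustrative only):

```julia
using JuMP, Test

function test_full_model()
    data = Data([ConstraintA()])
    model = main(data)
    # A point that should satisfy every constraint: the report is empty.
    @test isempty(primal_feasibility_report(model, Dict(model[:x] => 0.5)))
    # A point that should violate ConstraintA: the report lists the violation.
    @test !isempty(primal_feasibility_report(model, Dict(model[:x] => 2.0)))
    return
end
```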
For smaller applications, this might be too much complexity. Everything's a trade-off. Getting the input data into the right format and validating it is more important than the JuMP details, though.
@odow I hope you don't mind me bumping this old conversation. I was curious whether, after some time, there have been any developments in your thoughts on the modular construction of things (especially since the nonlinear rewrite is now finished).
After a year of experience in this field, which was new to me at the time, I completely agree that the data modeling stage is often more important than the details of building the optimization model.
That being said, as you may have noticed from my response in another thread (linked below), I'm very interested in the potential of the C-Set data structure to bring the data model and the optimization model closer together. Hence my interest in understanding in a bit more detail the thoughts you sketched out in your last reply here; I'd like to see whether this approach can also benefit the modular construction and testing of the optimization model.
I think the design patterns tutorial is still the best reference.
Re the nonlinear rewrite: there wasn't anything specific to it that I had in mind. I was just busy with that development and didn't have time for other things.
There are two open issues you might be interested in:
Thanks! Yeah, regarding the nonlinear rewrite, I just meant to ask whether wrapping up that work has freed up any time to think about this stuff.
Thanks for the issue links; the one regarding namespaces is quite interesting. I have a few questions:
- Have there been any examples people have come up with (or questions you have seen here) where the current syntax is limiting and Pyomo-style blocks would help?
- I'm aware this is all very conceptual at the moment, but if a constraint included variables from two different blocks, where would it be stored?
Nope. I'm very much open to suggestions! It might be a chicken-and-egg situation: JuMP models are built relatively "flat" because we have only a single common namespace, and if we had nested namespaces, people might write their models differently. I think it's more that blocks seem to work quite well for Pyomo, and perhaps they could work well for JuMP, rather than there being a pressing need that people in the community are asking for.
In a common parent block of the two variables (however high up that might be)
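To make that concrete with what exists today (purely an illustration of the idea, not a proposed JuMP API), you can already fake the part1.x, part2.x style by namespacing anonymous variables in a container, with the linking constraint living in the single "parent" model:

```julia
using JuMP

model = Model()
# Hypothetical illustration: emulate `part1.x`, `part2.x` by namespacing
# anonymous variables in a container keyed by "block" name.
blocks = Dict(
    b => Dict(:x => @variable(model, base_name = "$(b).x")) for b in (:part1, :part2)
)
# A constraint linking variables from two "blocks" is stored in their common
# parent, which here is the single flat model.
@constraint(model, blocks[:part1][:x] + blocks[:part2][:x] <= 1)
```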
Nope. That’s really about efficient ways to write summations over sparse index sets.
Regarding blocks, I am certainly interested in namespacing variables. The AlgebraicJulia people have needed this capability (part1.x, part2.x, etc.) for multi-physics modeling before. I'll have to take some time to understand what they've done to see if it's relevant. In the meantime, I can watch the Pyomo talk from JuMP con and see if I can somehow get a copy of this chapter, Structured Modeling with Blocks | SpringerLink, to see how these are used in practice.
The support for relational algebra is interesting. How much of that is in scope for JuMP? I saw the notes about parsing and transforming into conjunctive normal form; that could get very complex depending on how many speed optimizations you attempt. The DataFrames syntax from the "ijklm" model looks really clear to me, and DataFrames is already really fast. I have some examples I've cooked up using Catlab to do those manipulations, which I'm trying to work into a blog post.
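For reference, here's a rough sketch of the kind of sparse-index summation being discussed, driven from a DataFrame rather than loops over a dense product set (the column names and data are invented for illustration):

```julia
using DataFrames, JuMP

# Hypothetical data: only the (i, j) pairs that actually exist appear as rows,
# so the model is built directly over the sparse index set.
ij = DataFrame(i = [1, 1, 2], j = [1, 2, 2], cost = [1.0, 2.0, 3.0])

model = Model()
x = Dict(
    (r.i, r.j) => @variable(model, lower_bound = 0, base_name = "x[$(r.i),$(r.j)]")
    for r in eachrow(ij)
)
@objective(model, Min, sum(r.cost * x[(r.i, r.j)] for r in eachrow(ij)))
```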