Modular implementation of optimization problems with JuMP

Hello,

This is a follow-up to my question from the thread about dynamic JuMP variable names: Using anonymous and or symbols to create dynamic varible names? - #3 by pierre-haessig.

I’m sharing here a JuMP code design study on how to implement single/multi stage variants of the same optimization problem in a modular way, i.e. sharing as much common model code as possible.

I’m interested in your thoughts on this code design.

Motivation

In my use case, modularity is welcome because each stage has more or less the same variables/constraints. This differs from many textbook two-stage programs, such as the farmer example implemented in InfiniteOpt.jl/Two-Stage Stochastic Program, where first- and second-stage variables are different (and named as such: x vs y(\xi)).

Study notebook

It would be a bit long to paste the example, so here is the link to the Jupyter notebook in a Gist. Notes:

  • The example is a toy energy system optimization problem, i.e. with optimal sizing and operation jointly solved. Since there is only one power source, the trivial solution is that Pgen should be the maximum of the requested load time series, i.e. the peak load.
  • The two-stage aspect comes from considering two subsequent periods, that is, allowing a mid-project generator resize.
  • The stochastic aspect comes from the second stage load being uncertain, so different resize values are to be considered.

Also notice that I arrived at two different code designs (“A” and “B”, named chronologically) and in the end I prefer the simplicity of option “B”.

Summary of “Modular implementation B”

The key building block is the inner stage problem wrapped in a function (m is a JuMP Model, dat is a Dict{String,Any} used to store variable references):

function optim_model_stage_B!(m, dat, load; stage="")
    K = length(load)

    # Variables
    Pgen = add_stage_variable_B!(m, dat, "Pgen"; stage=stage)
    gen = add_stage_variable_B!(m, dat, "gen"; K=K, stage=stage)
    
    # Constraints:
    dat["balance"] = @constraint(m, [k=1:K], gen[k] == load[k], base_name="balance$stage")
    dat["Pgen_cons"] = @constraint(m, [k=1:K], gen[k] <= Pgen, base_name="Pgen_cons$stage")

    # scenario cost
    dat["cost"] = Pgen + sum(gen)
end

This uses an add_stage_variable_B! wrapper of @variable macro to avoid having name conflict in the JuMP Model, in the spirit of Using anonymous and or symbols to create dynamic varible names? - #3 by pierre-haessig. Instead of using the JuMP Model, variables are stored in the Dict.
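For readers who want to run the snippet above without opening the notebook, here is a minimal sketch of what such a wrapper could look like; this is my reconstruction from the calls above (the K keyword and the stage suffix), not the exact code from the notebook:

```julia
# Hypothetical sketch of the variable wrapper used above (the real one lives
# in the notebook). It creates an anonymous JuMP variable, stores the
# reference in `dat`, and uses "name * stage" only as the printing name.
function add_stage_variable_B!(m, dat, name; K=1, stage="")
    base = "$name$stage"
    if K == 1
        v = @variable(m, base_name=base)        # scalar variable
    else
        v = @variable(m, [1:K], base_name=base) # vector variable (time series)
    end
    dat[name] = v  # Dict key has no stage suffix: one Dict per stage
    return v
end
```

Since the variables are anonymous, nothing is registered in the Model's namespace, so calling this several times with the same `name` (once per stage Dict) cannot clash.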

About not wrapping @constraint

Notice that in the above example, I use a wrapper for @variable, but not for @constraint.
The code looks a bit unbalanced (asymmetrical) in that respect, but I was unable to create a constraint wrapper that allows the large formulation flexibility permitted by @constraint. As an example, consider “Pgen_cons” above, which is a vector constraint mixing vector and scalar variables.
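To illustrate the difficulty: a wrapper is possible if one restricts the constraint shape, e.g. to elementwise `lhs <= rhs`, but then it loses @constraint's flexibility. This sketch is a hypothetical helper (not in the notebook) showing the limitation:

```julia
# Hypothetical constraint wrapper, restricted to elementwise `lhs[k] <= rhs[k]`.
# Unlike @constraint, it cannot directly express `gen[k] <= Pgen`, which mixes
# a vector and a scalar: the caller must broadcast the scalar by hand.
function add_stage_constraint_B!(m, dat, name, lhs, rhs; stage="")
    K = length(lhs)
    con = @constraint(m, [k=1:K], lhs[k] <= rhs[k], base_name="$name$stage")
    dat[name] = con
    return con
end

# Usage for "Pgen_cons" would need manual broadcasting of the scalar Pgen:
# add_stage_constraint_B!(m, dat, "Pgen_cons", gen, fill(Pgen, K); stage=stage)
```

Every other constraint shape (scalar constraints, equality, expressions on both sides…) would need its own variant or extra arguments, which is why I kept the bare @constraint macro instead.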

Single stage usage

This stage problem can be used as a single stage (i.e. a classical deterministic energy system sizing) after setting the objective. Notice that the @objective macro is not called inside optim_model_stage_B!; instead, the function defines dat["cost"], which is only a fragment of the global cost in a multi-stage context:

m = Model(HiGHS.Optimizer)
data = Dict{String,Any}()
load = [1., 1.5, 1.2, 0.9] # dummy problem data

optim_model_stage_B!(m, data, load)

@objective(m, Min, data["cost"])
optimize!(m)
Pgen_val = value(data["Pgen"])
println("Pgen=$(Pgen_val)") # Pgen=1.5

Two-stage stochastic program usage

Remember that the goal of this code design study is that the inner stage can be reused in a larger two-stage problem. And indeed, 1+n_s stage instances are created: one for the first stage, plus one for each of the n_s stochastic scenarios of the second stage.

# second stage data: 3 equally probable load changes
load_scenar = [0.5, 1.0, 2.0] # (decrease, constant, increase)
ns = length(load_scenar)
proba_scenar = [1., 1., 1.]/ns

m = Model(HiGHS.Optimizer)

data1 = Dict{String,Any}()
optim_model_stage_B!(m, data1, load; stage=1)

data2 = [Dict{String,Any}() for s=1:ns]
for s=1:ns
    load_s = load*load_scenar[s]
    optim_model_stage_B!(m, data2[s], load_s; stage="2$s")
end

Then, there is more work to define the global problem by joining the different stages, in the objective and/or in constraints. Here are examples of both:

  • the cost is the sum of the expected stage costs
  • a coupling constraint forbids the 2nd-stage generator from being smaller than the 1st-stage one

cost2 = [data2[s]["cost"] for s=1:ns]
cost2_exp = cost2' * proba_scenar
@objective(m, Min, data1["cost"] + cost2_exp)

@constraint(m, [s=1:ns], data1["Pgen"] <= data2[s]["Pgen"])

Optimization results are retrieved by digging into the Dict containers of each stage:

optimize!(m)
Pgen1_val = value(data1["Pgen"])
println("Pgen1=$(Pgen1_val)") # Pgen1=1.5
Pgen2_val = [value(data2[s]["Pgen"]) for s=1:ns]
println("Pgen2=$(Pgen2_val)")
# Pgen2=[1.5, 1.5, 3.0] and not [0.75, 1.5, 3.0]
# due to the stage coupling constraint

That’s it, any thoughts welcome!

Take a read of Design patterns for larger models · JuMP

Yes, this doc section was part of my “literature review” since you had already mentioned it on Stack Overflow or here, so it’s in the “References” section of my notebook.

Most parts (wrapping the model in a function, having subfunctions to create variables or constraints) do apply (and are applied) to my use case. The only thing I didn’t need was dedicated data structures like KnapsackObject.

So the only aspect I didn’t find in that doc (and that was my motivation for making this code design study and posting it here) was the ability to instantiate several replicas of more or less the same inner problem inside one larger problem, hence my interest in dynamic variable naming. In the end, I decided to use anonymous variables, identified by their String key in a Dict. Having one Dict per inner problem is like having a dedicated namespace for each inner subproblem, which solves the duplicate-name clashes. As a bonus, I also append an optional “stage” suffix to the base_name of each variable, so that the String/LaTeX representation is less ambiguous.


I just came across the 2019 post Pyomo style "blocks" in JuMP by @jonmat because the more I dig into it, the more I realize that I’m indeed in search of the “namespace” feature mentioned there (I didn’t know that Pyomo provides this). Indeed, for some other project, I may be looking for a Modelica-style object/component-oriented structuring of a JuMP model.

Now, I understand that in the present state of JuMP it’s indeed possible to implement such things using anonymous variables. However, looking at my “add_variable” wrapper around the classical @variable macro (see below), I feel like I’m just creating a more cumbersome version of @variable (see e.g. the presence of the lb and ub keyword args, since I cannot use the nice lb <= x <= ub syntax without a macro), for the sole reason that I want to store the VariableRef in my container of choice (see model_data::Dict{String,Any}) rather than in the Model’s existing global container of registered variables.

So my question is: what would be the difficulty to bake this “reference storage hijacking” directly in the @variable macro? Or should I copy-paste the @variable source code and try to adapt it for my use case?

my “add_variable” wrapper

This is the present state of my custom “add_variable” wrapper around @variable (early version discussed in Using anonymous and or symbols to create dynamic variable names? - #3 by pierre-haessig). Main features are:

  • it uses anonymous variables to avoid registration in the global namespace of the Model
  • it instead stores the VariableRef in a Dict
  • it then returns the VariableRef to make it easy to bind it to a similarly named Julia variable in the caller.

But to get these features, as said above, I have to pay the price of a much clumsier API compared to the original @variable.

"""
    add_var!(model_data::Dict{String,Any}, name::String;
             lb=nothing, ub=nothing, fix=nothing, K::Int=1, stage="")

add a variable to the JuMP `Model` `model_data["model"]` and store the reference
to that variable in `model_data[name]`.

Returns the variable reference.

`name` is also used as the `base_name` of the JuMP variable (used for printing),
with the optional `stage` argument (`Int`, `String`...) used as a suffix.

# Other optional arguments:
- lower and/or upper bounds can be set with `lb` and `ub`
- equal bounds can be set with `fix`
- the variable can be a Vector of length `K` if `K > 1` (time series).
"""
function add_var!(model_data::Dict{String,Any}, name::String;
                  lb=nothing, ub=nothing, fix=nothing, K::Int=1, stage="")
    model = model_data["model"]
    name_suffix = "$name$stage"
    
    if K==1
        v = @variable(model, base_name=name_suffix)
    elseif K>1
        v = @variable(model, [1:K], base_name=name_suffix)
    else
        throw(ArgumentError("`K` array size parameter should be >=1"))
    end
    model_data[name] = v

    # Bounds:
    if (lb !== nothing || ub !== nothing) && fix !== nothing
        throw(ArgumentError("`lb` and `ub` bounds cannot be used in conjunction with `fix`"))
    end
    if fix !== nothing
        # Qualify with JuMP: the `fix` keyword argument shadows `JuMP.fix` here,
        # and the fixed value must be passed explicitly.
        JuMP.fix.(v, fix)
    end
    if lb !== nothing
        set_lower_bound.(v, lb)
    end
    if ub !== nothing
        set_upper_bound.(v, ub)
    end
    
    return v
end

which is to be used as

# One-shot model setup
md = Dict{String,Any}()
md["model"] = Model(HiGHS.Optimizer)
# add piles of variables:
x = add_var!(md, "x", lb=0.0, ub=power_rated_gen_max)
y = add_var!(md, "y", lb=0.0, ub=power_rated_gen_max)

In almost all cases, writing your own macro is the wrong thing to do.

You could also use the regular @variable macro, and then call unregister to free up the name again: JuMP · JuMP

Yes I agree, it would be a lot of work to make a worse/more fragile copy of the original one.

So perhaps a variant of my question is: is it imaginable that a future version of the regular @variable macro would accept an optional “model container/data” keyword that would store the VariableRef there instead of in the model itself?

The idea would be to condense the pattern of the last three lines into just one line:

data = Dict{String,Any}()
model = Model(HiGHS.Optimizer)

@variable(model, 0 <= x <= 1)
unregister(model, :x)
data["x"] = x

(and same for constraints)

Just do:

data = Dict{String,Any}()
model = Model(HiGHS.Optimizer)
data["x"] = @variable(model, 0 <= x <= 1)
unregister(model, :x)

From a code-readability standard, I think that writing something slightly more verbose that uses the standard JuMP features than trying to writing something custom that saves a line of typing but which is non-standard.

You’re right, it occurred to me afterwards that the Dict registration line could be merged with the @variable declaration line. So the final “code writing cost” is:

  • writing the variable name three times, but at least these duplications are contiguous, so little risk of getting out of sync
  • one extra line for unregister

About your final comment, I’m assuming that you meant something like

“From a code-readability standard, I think that writing something slightly more verbose that uses the standard JuMP features [is better/safer/simpler to read and maintain…?] than trying to writing something custom that saves a line of typing but which is non-standard.”

and I agree.

Thanks for all your feedback.


The duplication of the variable name and the unregister line could be eliminated with anonymous variables, at the cost of uglier syntax for the upper/lower bounds: Variables · JuMP

Yes, with anonymous variables comes the need to use the lower_bound/upper_bound keywords. Then this starts to look like my custom add_var! wrapper mentioned above, which I’d like to avoid.

But the main argument against anonymous variables is the need to add the variable to the current namespace yourself (I say “need” in the sense that it’s handy for writing subsequent model lines). Plus, using anonymous variables requires setting the base_name attribute (here “need” means only that it is useful when printing expressions or constraints).

In the end, it’s about trade-offs:

data["x"] = @variable(model, 0 <= x <= 1)
unregister(model, :x)

versus

x = @variable(model, lower_bound=0, upper_bound=1, base_name="x") # base_name is optional, only for prints
data["x"] = x # and this may be compressed into data["x"] = x = @var...

Now, thinking a bit more: if I’m sure I want to unregister all registered symbols, it’s possible to do it in one loop at the end, using the object_dictionary(model) mentioned in How to get all the registered variable names of a `JuMP` model?. The model definition would then look like this:

data["x"] = @variable(model, 0 <= x <= 1)
data["y"] = @variable(model, 0 <= y <= 1)
@constraint(model, x+y <= 1) # here comes the interest of having x and y in the namespace
# ...

for var in keys(object_dictionary(model))
    unregister(model, var)
end