Writing legible JuMP models

I maintain a few large integer programming routines using the Python OR-Tools package. I have grown quite attached to the following code style, which breaks the compilation of the larger optimization problem into self-contained steps. Notice how the __init__ method serves as a “table of contents” for the overall compilation sequence:

(This isn’t a working example, just an illustration of style using a contrived problem class.)

from ortools.linear_solver import pywraplp

class BookshelfOptimizationProblem:
    """A class for the bookshelf optimization problem, with input data a, b,
    and c, and decision variables x, y, and z."""

    # Global "config" options I don't want the user to mess with
    SOLVER_NAME = "SCIP"
    SOLVER_SHOW_OUTPUT = False
    SOLVER_TIME_LIMIT = 600  # seconds

    def __init__(self, a, b, c) -> None:
        """Initialize an instance of the bookshelf optimization problem
        and run the compilation sequence."""
        self.a = a
        self.b = b
        self.c = c
        self.validate_input_data()
        self.initialize_solver()
        self.generate_decision_variables()
        self.generate_shelf_width_constraints()
        self.generate_shelf_height_constraints()
        self.generate_alphabetical_order_constraints()
        self.generate_objective_function()


    def solve(self, filename):
        "Optimize, then validate and save the solution."
        status = self.optimize()
        self.validate_solution()
        self.save_solution_as_csv(filename)
        return status

    def validate_input_data(self):
        "Assert that a, b, and c have correct dimensions, bounds, etc"

    def initialize_solver(self):
        "Initialize the MILP solver."
        self.milp_solver = pywraplp.Solver.CreateSolver(self.SOLVER_NAME)
        self.milp_solver.SetTimeLimit(self.SOLVER_TIME_LIMIT * 1000)

    def generate_decision_variables(self):
        "Generate the decision variables, with sizes inferred from the input data."
        self.x = [self.milp_solver.BoolVar(
            f"x[{i}]") for i in range(len(self.a))]
        # self.y = ...
        # self.z = ...
        self.some_helpful_expression = f(self.x, self.y, self.a, ...)

    def generate_shelf_width_constraints(self):
        self.milp_solver.Add(sum(self.x) <= sum(self.b))

    def generate_shelf_height_constraints(self):
        "Generate the shelf height constraints."

    def generate_alphabetical_order_constraints(self):
        "Generate the alphabetical order constraints."

    def generate_objective_function(self):
        """Construct the objective function. This docstring includes some information
        about why this objective function was selected."""
        self.milp_solver.Minimize(sum(self.x) - self.some_helpful_expression)

    def optimize(self):
        "Solve the MILP using the backend defined in `self.SOLVER_NAME`."
        print(f"Solving problem using backend {self.SOLVER_NAME}.")
        if self.SOLVER_SHOW_OUTPUT:
            self.milp_solver.EnableOutput()
        status = self.milp_solver.Solve()
        return status

    def validate_solution(self) -> None:
        "Double check that the current solution satisfies certain derived properties."

    def save_solution_as_csv(self, filename):
        "Write the solution values to `filename`."

if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [1, 2, 3, 4, 5]
    c = [1, 2, 3, 4, 5]

    problem = BookshelfOptimizationProblem(a, b, c)

With JuMP and its dependence on macros such as @variable, I find myself using a much more linear coding style, which makes it harder to isolate problems or swap out components. The above code might “translate” to (again, not a working example):

using JuMP
import SCIP

# Global "config" options I don't want the user to mess with
const SOLVER_NAME::String = "SCIP"
const SOLVER_SHOW_OUTPUT::Bool = false
const SOLVER_TIME_LIMIT::Int = 600

"Assert that a, b, and c have correct dimensions, bounds, etc"
function isvalidinputdata(a, b, c)
    return true
end

validatesolution(x, y, z) = ...
tocsv(x, y, z) = ...

function solvebookshelf(a, b, c, filename)
    # validate_input_data
    @assert isvalidinputdata(a, b, c)

    # initialize_solver
    model = Model(SCIP.Optimizer) # if SOLVER_NAME ...

    # generate_decision_variables
    @variable(model, x[1:length(a)], Bin)
    @variable(model, y, ...)
    @variable(model, z, ...)
    @expression(model, some_helpful_expression, ...)

    # generate_shelf_width_constraints
    @constraint(model, ...)
    @constraint(model, ...)

    # generate_shelf_height_constraints
    @constraint(model, ...)
    @constraint(model, ...)

    # generate_alphabetical_order_constraints
    @constraint(model, ...)

    # generate_objective_function
    @objective(model, ...)

    # solve/optimize
    optimize!(model)

    x_vec = value.(x)
    y_vec = value.(y)
    z_vec = value.(z)

    # validate_solution
    validatesolution(x_vec, y_vec, z_vec)

    # save_solution_as_csv
    tocsv(x_vec, y_vec, z_vec)
end

To me, the Julia code is much less ergonomic:

  • Though I am able to extract functions at the beginning (isvalidinputdata) and at the end (validatesolution, tocsv) of the sequence, where I am working with “flat” vectors, the long blob of @constraint in the middle is not very reader-friendly. I cannot see a way to break up this blob other than creating very long function signatures like generate_shelf_height_constraints(model, x, y, z, a, b, c, some_other_helper_expression, ...) or abusing global scope.
  • The Python version makes it easier to separate my interaction with the solver interface (as the developer) from my users' interaction with the problem class.
    • In the Python version, if (say) during the self.generate_shelf_height_constraints() step I can spot an easy way to detect problem infeasibility early on, then I can just modify the type signature to return a bool trivial_infeasibility_detected and handle it from the __init__ method.
    • In the Julia case, I have to bury all that logic right next to the constraint generation itself, so if the user clicks on a stacktrace saying “error at line 87,” instead of landing on a line that says
      if (trivial_infeasibility_detected := self.generate_shelf_height_constraints()):
         raise RuntimeError
      she is looking at 15 lines of @constraint followed by a cryptic arithmetic expression, and has to scroll up to see that this block of code is preceded by a comment explaining that this is the generate_shelf_height_constraints step.
    • (I hope this explanation illustrates the point, but let me know if you need a more specific example.)
  • The Python code plays nicely with code folding and navigation in common IDEs without having to manually define code regions.
  • The Python code allows me to defer calling the solve method until the reader is ready (helpful in an interactive context where you want to first test your data read-in and then start solving).
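To make the early-exit idea above concrete, here is a solver-free Python sketch of the pattern from the second bullet; the class and the infeasibility check are hypothetical stand-ins, not the real optimization code:

```python
class ShelfProblem:
    """Toy stand-in for the optimization class; no solver involved."""

    def __init__(self, book_heights, shelf_height):
        self.book_heights = book_heights
        self.shelf_height = shelf_height
        # The compilation sequence can react to what each step reports:
        if self.generate_shelf_height_constraints():
            raise RuntimeError("trivially infeasible: a book is taller than the shelf")

    def generate_shelf_height_constraints(self) -> bool:
        # (constraints would be added to the solver model here)
        # Report any trivial infeasibility spotted while building them.
        return max(self.book_heights) > self.shelf_height
```

The point is that the step both generates constraints and returns a flag, and the "table of contents" method decides what to do with that flag.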

Do you have any suggestions for improving the legibility and usability of the Julia code along these lines?

(Please don’t just link me to this article :wink: )


I can provide some specific advice for your model when I’m at my computer and not on my phone, but before then, have you read


Thank you, I hadn’t seen this yet. It solves many of my problems, but not (as I can now see) the main problem, which is how to isolate blocks of code responsible for generating a group of related constraints or expressions without having to spend a lot of redundant syntax passing around the names of the objects in my model.

For example, one pretty straightforward tip in that tutorial is to wrap the problem data a, b, c in a struct. But can I do the same with the decision variables x, y, z? I find the x = model[:x] syntax in this example to be quite clunky when repeated over many functions:

function add_knapsack_objective(model, data)
    x = model[:x]
    @objective(model, Max, sum(v.profit * x[k] for (k, v) in data.objects))
end

In Python, for a model with many data and decision variable names, I would do (going by memory for some of these type annotations)

class ProblemData:
    a: list[float]
    b: list[float]
    c: list[float]

class DecisionVariables:
    x: list[pywraplp.BoolVar]
    y: list[pywraplp.NumVar]
    z: list[pywraplp.NumVar]

and then my main problem instance class would populate fields self.problem_data and self.decision_variables. Now I can access self.decision_variables.x from anywhere within the class.
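A minimal, runnable sketch of that layout (using `dataclasses`, with the class stripped down to just the two containers; plain `bool`/`float` element types stand in for the pywraplp variable classes):

```python
from dataclasses import dataclass

@dataclass
class ProblemData:
    a: list[float]
    b: list[float]
    c: list[float]

@dataclass
class DecisionVariables:
    # Placeholder element types instead of solver variable objects.
    x: list[bool]
    y: list[float]
    z: list[float]

class BookshelfProblem:
    def __init__(self, a, b, c):
        self.problem_data = ProblemData(a, b, c)
        self.decision_variables = DecisionVariables([], [], [])

problem = BookshelfProblem([1.0], [2.0], [3.0])
problem.problem_data.a          # accessible from any method of the class
problem.decision_variables.x
```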

From what’s in the tutorial, I think that generate_shelf_width_constraints() would look like

function generate_shelf_width_constraints(model, data)
    x = model[:x] # lots of boilerplate
    y = model[:y] # in every
    z = model[:z] # function

    @constraint(model, ...)
end

and if I wanted to break the steps of this function up into further subfunctions, I would have to repeat the same boilerplate again (at least the relevant variables). It also kind of bugs me that the problem data is passed as an argument while the decision variables are retrieved as attributes of the model—a sort of awkward halfway between object oriented and functional.

Hmm, but I guess if I want to use model[:x] directly in my expressions, then there’s nothing stopping me…

JuMP variables are regular Julia types. You can create a struct to hold them, just like in Python:

struct DecisionVariables
    x::Vector{VariableRef}
    y::Vector{VariableRef}
    z::Vector{VariableRef}
end

The model[:key] syntax is just a simplified way to access components of the model. If it doesn’t meet your needs, you can create a different data structure.

Is this the right way to do this?

using JuMP
using SCIP

struct Variables
    x::Vector{VariableRef}
    y::Vector{VariableRef}
    z::Vector{VariableRef}
end

function main()
    model = Model(SCIP.Optimizer)

    @variable(model, x[1:3])
    @variable(model, y[1:3])
    @variable(model, z[1:3])

    variables = Variables(x, y, z)

    @objective(model, Max, sum(variables.x))
end


I’m coming around to it, but the aliasing of x and model[:x] and variables.x still makes my object-oriented brain nervous.

Is this the right way to do this?

Yes, that’s one way. One downside to JuMP is that Julia is very flexible. That means there is no OneTrueWay to do things. It’s up to you, with a little bit of software engineering, to write something that makes sense for the application.

still makes my object-oriented brain nervous.

Embrace the change :laughing: : Julia is just data and functions which act on data. (The trick is designing the right data structure…)


Maybe I’m missing something, but is model[:x] really that much different from self.x? You could even shorten it up a bit:

function foo(m::Model)
    x = m[:x]
    # ...
end

No, you aren’t missing anything, I’m just pointlessly irked by the inconsistency between x and model[:x] for variables and a and data.a for problem data :slight_smile:

I often use NamedTuples to aggregate parameters, variables and constraints to be passed around in JuMP models. They’re kind of like minimalistic structs that don’t need to be declared in advance, and there are neat shortcuts for packing and unpacking them. They can contain mixed types but Julia’s compiler still has no problem “seeing through” them so your code will be just as type stable as with structs. For example:

using JuMP
using SCIP

function runmodel(; a=default_a, b=default_b, c=default_c)
    params = (; a, b, c)    # short for params = (a=a, b=b, c=c)
    @assert isvalidinputdata(params)

    model = Model(SCIP.Optimizer)
    vars = generate_decision_variables(model, params)
    constraints = generate_constraints(model, vars, params)

    @objective(model, Max, sum(vars.x))
    optimize!(model)

    validatesolution(model, vars)
    # etc.
end

function generate_decision_variables(model, params)
    (; a, b, c) = params

    @variable(model, x[1:length(a)], Bin)
    @variable(model, y, ...)
    @variable(model, z, ...)
    @expression(model, some_helpful_expression, ...)

    return (; x, y, z, some_helpful_expression)
end

function generate_constraints(model, vars, params)
    (; a, b, c) = params
    (; x, y, z, some_helpful_expression) = vars

    # generate_shelf_width_constraints
    @constraint(model, ...)
    @constraint(model, ...)

    # generate_shelf_height_constraints
    @constraint(model, ...)
    @constraint(model, ...)

    # generate_alphabetical_order_constraints
    @constraint(model, ...)

    # return names of constraints here if you need to reference them, e.g. for shadow prices
    return (; constraint_B, constraint_D)
end

runmodel(a=23, c=12)

No struct declarations needed - and guaranteed 100% free from OOP boilerplate too! :slight_smile:
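(For Python readers following along: a rough analogue of this pack/unpack style, using `types.SimpleNamespace` in place of a NamedTuple; all names are illustrative and no solver is involved:)

```python
from types import SimpleNamespace

def generate_decision_variables(params):
    a, b, c = params.a, params.b, params.c  # "unpack", like (; a, b, c) = params
    x = [0] * len(a)                        # placeholder decision variables
    some_helpful_expression = len(b) + len(c)
    # "pack" the new names into a lightweight record, like (; x, ...)
    return SimpleNamespace(x=x, some_helpful_expression=some_helpful_expression)

params = SimpleNamespace(a=[1, 2], b=[3], c=[4])
vars = generate_decision_variables(params)
vars.x                        # [0, 0]
vars.some_helpful_expression  # 2
```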