JuMP - defining problem by composing functions/expressions and mapping over collection

Hi, I’m new to both Julia and JuMP. Experience in C#, Scala and I like functional programming. Would love some help to get in to Julia optimisation.

How do I compose functions or expressions in JuMP? I can make a trivial example that works for a one-variable problem as follows, with the expressions to compose in the middle. It’s the step after that where I get stuck.

using JuMP, Clp

m = Model(Clp.Optimizer)

# Data
sourceMass_t = 1000.0
sourceGold_g = 600.0

# Fraction of each component taken
@variable(m, 0.0 <= f <= 1.0)

# Expressions to compose
@expression(m, mass_t, f * sourceMass_t)
@expression(m, gold_g, f * sourceGold_g)
@expression(m, cost_USD, 15.0 * mass_t)
@expression(m, recGold_TrOz, 0.90 / 31.1 * gold_g)
@expression(m, rev_USD, 1800 * recGold_TrOz)

# Net value objective
@objective(m, Max, rev_USD - cost_USD)

# Only constraint
con = @constraint(m, mass_t <= 700)

optimize!(m)  

The next step to make that slightly more interesting is to have multiple data records, so data becomes something like:

sourceMass_t = [1000.0, 1500.0, 2000.0]
sourceGold_g = [600.0, 700.0, 900.0]   

and objective and constraint change to become sums:

con = @constraint(m, sum(mass_t) <= 2200)
@objective(m, Max, sum(rev_USD) - sum(cost_USD))

The piece I’m missing is the ability to use the expressions composed already and apply to each pair of sourceMass_t and sourceGold_g. i.e. a JuMP-friendly map function, where each pair is given an associated f variable, rev_USD and cost_USD. In pseudocode:

map(the_composed_expressions, zip_of_sourceMass_t_and_source_Gold_g)  

I don’t believe I should need to rewrite all the expressions to accept arrays of variables and manually iterate over them; the function graph doesn’t change.

The goal after that would be to have many many more functions to compose and many more f fractions at different decision points, so I’m interested in the general approach here. Thanks.

Hi!

Can you share a MWE of your code that does not work, together with the error message you’re getting and a description of the result you’d want to get?

At first glance, it sounds like what you’re looking for is Julia’s vectorization (aka broadcast) syntax.
It’s hard to say anything more without the MWE; I can only point you to Julia’s docs on vectorization and JuMP’s docs on vectorized constraints.

1 Like

Thanks, the links did help, I had seen the dot operator syntax but didn’t know what it was for, and didn’t realise that applying one of those operators to two arrays would implicitly zip the array elements into pairs before applying the operator.

I therefore have the following code which does work, though I still have a couple of follow-up questions.

using JuMP, Clp

m = Model(Clp.Optimizer)

# Data
sourceMass_t = [1000.0, 1500.0, 2000.0]
sourceGold_g = [600.0, 700.0, 900.0]

# Fraction of each component taken
@variable(m, 0.0 <= f[i=1:3] <= 1.0)

# Expressions to compose
@expression(m, mass_t, f.* sourceMass_t)
@expression(m, gold_g, f.* sourceGold_g)
@expression(m, cost_USD, 15.0.* mass_t)
@expression(m, recGold_TrOz, 0.90 / 31.1.* gold_g)
@expression(m, rev_USD, 1800.0.* recGold_TrOz)

# Net value objective
@objective(m, Max, sum(rev_USD) - sum(cost_USD))

# Only constraint
con = @constraint(m, sum(mass_t) <= 2200)

optimize!(m)

Questions are:

  • It appears that I need to choose, in the expressions, to use either dot syntax or not. Is that right? I would still prefer to define the expressions without the dot syntax (i.e. scalar) if possible, and only later apply them to the source data via a map function. Is there a way of doing that?
  • If there is a way to do the above, I would also like to be able to define a scalar variable f at that level, which would then be vectorised itself during map.
  • I’ve put the sourceMass_t and sourceGold_t in two different arrays, but really they should be one array of pairs/tuples. What is the normal Julia way, that works with JuMP, of defining this data - e.g. array of named tuples or a dataframe or something else? I’m a little hazy on what limitations JuMP might impose on general Julia code - any guidance?

Whether to use the dot syntax depends on what you are manipulating.
Take the following setting:

model = Model()

u = 5.0              # u is a scalar
U = [1.0, 2.0, 3.0]  # U is a vector of Float64
n = 3

@variable(model, f)       # f is a scalar variable
@variable(model, F[1:n])  # F is a vector of variables

and consider the following expressions (I’m dropping the @expression for conciseness)

  • f * u: variable f times 5
  • f * U: not defined (scalar variable times vector)
  • f .* U: vector of length 3, with elements f x 1, f x 2, f x 3
  • F * u: not defined (vector of variables times scalar)
  • u .* F: vector of length 3, with elements 5 x F[1], 5 x F[2], 5 x F[3]
  • F * U: not defined (vector times vector)
  • F .* U: vector of length 3, with elements 1 x F[1], 2 x F[2], 3 x F[3]

JuMP should not do anything suspicious with the dot syntax, just follow the Julia normal behavior.
Again, to know whether you should use . or not depends on whether you’re manipulating vector or scalar quantities.

What is the normal Julia way, that works with JuMP, of defining this data

IMO: whichever works and you’re most comfortable with :slight_smile:
For small data, you shouldn’t notice a big difference. For large datasets, you just want to avoid unnecessarily copying and moving around your data.
For instance,

for (u, v) in zip(U, V)
    # something with u and v
    # ...
end

will not copy anything, it will just create an iterator over pairs of elements in U and V.
However, if you do

n = length(U)
W = [(U[j], V[j]) for j in 1:n]  # this will allocate
for (u, v) in W
    # ...
end

you will potentially copy the original U and V. Plus, AFAIK, iterating over W is not faster than a zip(U, V).

what limitations JuMP might impose on general Julia code

The nice part of having JuMP embedded in Julia, is that it does not impose any limitation on general Julia code.

2 Likes

Thanks @mtanneau, that is lots of good information and helps me get a better grasp on the ins and outs of how scalars and vectors are defined in Julia. I also found some related reading such as the Julia blog here, which includes a comparison of broadcast and map. It seems the best approach for my current usage is just to use vector operations for everything.

I still wonder if it’s possible to define a scalar variable in a JuMP expression that is later ‘lifted’ into a vector of variables when mapped against an array, but that might be something I experiment with later. Perhaps when I learn macros I can write a transformation. If I work something out I’ll post it back here.

I assume the for j in 1:n syntax below is just syntactic sugar for a regular for loop that declares and then fills the W array.

It seemed to work that way in the REPL, anyway.
I’ll try be mindful of performance considerations though I’m sure that’s a deep topic, so something for later.

I think the technical term for that is “comprehension” (see this part of the Julia documentation) ; this also exists in Python (as in this example).
Sometimes the compiler can do something smarter than just allocate an array and fill it.

It seems you have discovered the motivation for Rima: GitHub - geoffleyland/rima: Rima is a tool for formulating mathematical models Here’s the paper to read: rima/Rima-ORSNZ-paper-2010.pdf at master · geoffleyland/rima · GitHub.

The piece you’re missing, is that @expression(m, mass_t, f * sourceMass_t) doesn’t define an algebraic expression that we later plug the values of f and sourceMass_t into, ala Pyomo or AMPL. It constructs a concrete expression using the values of the variables at that point in time.

However, following up on @mtanneau’s suggestions, you might want to consider a style like the following, which leverages Julia’s functions and broadcast to allow you to “re-use” scalar expressions.

foo(model, a, x) = @expression(model, a * x)

model = Model()
@variable(model, x)
a = 1.0
f1 = foo(model, a, x)

@variable(model, y[1:2])
b = [1.0, 2.0]
f2 = foo.(model, b, y)

Or you could leverage multiple dispatch:

foo(model, a, x) = @expression(model, a * x)
foo(model, a::Vector, x::Vector) = @expression(model, a .* x)
f3 = foo(model, b, y)
1 Like

Thanks @odow, Rima looks very interesting - very much along the lines of what I was thinking re addressing unwanted dependencies between variables and expression, plus a layer of abstraction above arrays & matrices etc. I’ll check it out.

The multiple dispatch approach requires the same equation to be defined twice, which isn’t so good, but there does seem to be a benefit in wrapping in a scalar function as in foo(model, a, x) = @expression(model, a * x), I’ll play around with that one more.

For the record, I don’t think @gleyland is actively developing Rima anymore (He switched to JuMP!). I linked it more for interest – and it’s good inspiration for features to add to JuMP over the long-term.

1 Like

Hi,

Sorry, I’m way late on this one. @odow is right, I’m not working on Rima any more and I am a fan of JuMP. That’s not to say there weren’t some good things in Rima.

The “main feature” of Rima, however, that you could compose a model from parts, possibly isn’t one of them: I haven’t missed it in several years of building models. It was inspired by working on an electricity dispatch model, and at different times, thinking “it would be handy just to add a generator as an object”, and then, when I had to change a GAMS model from a single time period to multiple time periods (which involved adding a t index and changing every line of the model), thinking it would be nice to be able to define a model and then say “I want 10 of these, and here are the linking constraints”.

But, as I said, I haven’t really missed it (though I do currently have model that will take some work to change from a single time period to multiple periods…)

The things I miss are the “model and data are separate” bit, and the partial evaluation. In JuMP, if I define a model, and then print it, because the model and data are combined, I get the whole model in all its gory detail. In Rima, I defined a model, then could print it, and get a succinct description, and then add bits of data in, and see the model fill itself in.

Cheers,
Geoff

2 Likes

it would be handy just to add a generator as an object

PowerSimulations.jl has shown that you can use Julia’s multiple dispatch to do exactly this!

The printing story is a bit more difficult. I think we’ve gone the correct way only having concrete models. Pyomo’s abstract model has performance issues.

We could probably do a better job at printing a concise summary of the model: Print constraints/variables in set form instead of scalarized · Issue #2171 · jump-dev/JuMP.jl · GitHub. One thing we could potentially do is cache the constraints as inputted to the macros and print those.

1 Like

Thanks for picking this up, yes those pain points are exactly what I’m talking about. My usage is very similar to the ‘just add an extra generator to a power system’ one you describe, but for chemical/mining and industrial processes. And in general I just like having models as decoupled as possible - a graph of models that can work together or standalone, ideally.

Thanks @odow too for the link to PowerSimulations, some time later I might look at that for inspiration.

Here’s how I would implement the multiple knapsack problem from the Rima paper in JuMP. It’s less flexible, in that you have to plan what information you return from a function and you need to use anonymous variables, but it still reads pretty well.

using JuMP

struct Item
    weight::Float64
    profit::Float64
end

struct Knapsack
    x::Vector{VariableRef}
    objective::AffExpr
end

function knapsack(model, items::Vector{Item}, capacity::Float64)
    N = length(items)
    x = @variable(model, [i=1:N], Bin, base_name = "x_$(capacity)")
    @constraint(model, sum(items[i].weight * x[i] for i = 1:N) <= capacity)
    @objective(model, Max, sum(items[i].profit * x[i] for i = 1:N))
    return Knapsack(x, objective_function(model))
end

function knapsack(model, items::Vector{Item}, capacities::Vector{Float64})
    knapsacks = [knapsack(model, items, capacity) for capacity in capacities]
    @objective(model, Max, sum(k.objective for k in knapsacks))
    return knapsacks
end

function add_side_constraint(model, knapsacks::Vector{Knapsack})
    N = length(knapsacks[1].x)
    return @constraint(model, [i=1:N], sum(k.x[i] for k in knapsacks) <= 1)
end

items = Item.(Float64[2, 8, 4, 2, 5], Float64[5, 3, 2, 7, 4])
capacities = [8.0, 10.0]

model = Model()
model, knapsacks = knapsack(model, items, capacities)
add_side_constraint(model, knapsacks)
1 Like

@NickRedwood Please share the final working code for greater community learning. I’ve similar problem and still couldn’t solve it. Thanks.