Creating Variables from a Vector of Tuples and Naming Them

I have converted some PuLP code to Julia and have three vectors of Tuples.

supplylist: 149988-element Vector{Tuple{String15, String31}}
flowslist: 499572-element Vector{Tuple{String15, String15, String31}}
edgelist: 41631-element Vector{Tuple{String15, String15}}

I used the following code to create variables:

@variable(model, x[(i,j,k) = flowslist] >= 0.0)

@variable(model, y[(i,j) = edgelist] >= 0.0)

@variable(model, z[(i,j) = supplylist] >= 0.0)

This seemed to work but then when I tried to use:
set_name(x, "massflowtype")

I got this error:

MethodError: no method matching name(::JuMP.Containers.DenseAxisArray{VariableRef, 1, Tuple{Vector{Tuple{String15, String15, String31}}}, Tuple{JuMP.Containers._AxisLookup{Dict{Tuple{String15, String15, String31}, Int64}}}})

Any insights as to what is happening here? The lists are designed to be looped through and matched so that x[i,j,k] can be multiplied and matched according to values. Here is a piece of the mass balance equation for the PuLP model:

for item in supplylist:
    cn_coal_problem += (supplyitembynode[item] +
        lpSum([massflowtype[(i,j,ct)] for (i,j,ct) in flowslist if (j,ct) == item]) >=
        lpSum([massflowtype[(i,j,ct)] for (i,j,ct) in flowslist if (i,ct) == item]))

You’re looking for

@variable(model, x[(i,j,k) = flowslist] >= 0.0, base_name = "massflowtype")

set_name applies only to single variables, not to collections of them. See Variables · JuMP.

The lists are designed to be looped through and matched so that x[i,j,k] can be multiplied and matched according to values

Per your question on StackOverflow (julia - JuMP looping through and matching two different objects - Stack Overflow), I’m not sure that this is the right data structure for your problem. It’s reminiscent of the blog post JuMP, GAMS, and the IJKLM model | JuMP.

If you have a small reproducible example of what you currently have, we can probably point you in a better direction.

How best to post all the data prep etc.? It’s quite a lot of ETL, so I do not want to waste everyone’s time. Anyone who wants some consulting work I’m about at the point where I pay up.

I guess one option is to start with the nested for loops. If it’s easy to write and understand, that’s all you need. It’ll be faster than PuLP. But you might still run into scaling issues. (Although perhaps you’re fine with the runtime.)

If it becomes a problem, here’s another option:

using JuMP
import DataFrames
# One row per arc that actually exists: (origin, destination, coal group).
df = DataFrames.DataFrame(
    origin_node = ["A", "A", "B", "C"],
    destination_node = ["C", "C", "C", "A"],
    coal_group = ["x", "y", "y", "y"],
)
model = Model()
# One non-negative flow variable per row, stored as a column of the DataFrame.
df.x = @variable(model, x[1:size(df, 1)] >= 0, base_name = "mass_flow")
locations = union(df.origin_node, df.destination_node)
# For each coal group, flow into each location equals flow out of it.
for (index, gdf) in pairs(DataFrames.groupby(df, :coal_group))
    @constraint(
        model,
        [l in locations],
        sum(r.x for r in eachrow(gdf) if r.destination_node == l) ==
        sum(r.x for r in eachrow(gdf) if r.origin_node == l),
    )
end

Thanks, will try this. The issue is that with ~10k origins and ~10k destinations, I’ve got mild PTSD from trying to run every edge through PuLP (a 10k by 10k matrix with a lot of Inf cost values), and I suspect Julia might not love it either. Examples in the JuMP docs suggest this dense formulation, but I doubt it would scale well.

PuLP model compile runtime was around 6 hours and we got a lot of suggestions there. I think we have pretty much exhausted what can be done there.

If you have sparse origin-destination pairs, don’t loop over every possible pair. Just loop over the ones that exist. Precompute dictionaries of origin → list of destinations and destination → list of origins.
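
A minimal sketch of that precomputation, written in plain Python so it can sit next to the PuLP version. The three edges and the names destinations_of and origins_of are made up for illustration; only the shape of edgelist comes from the question.

```python
from collections import defaultdict

# Hypothetical sparse edge list: only the (origin, destination) pairs that exist.
edgelist = [("A", "C"), ("B", "C"), ("C", "A")]

destinations_of = defaultdict(list)  # origin -> destinations reachable from it
origins_of = defaultdict(list)       # destination -> origins that feed it

# A single pass over the existing edges; the full origin x destination
# grid is never materialised.
for origin, destination in edgelist:
    destinations_of[origin].append(destination)
    origins_of[destination].append(origin)

print(dict(destinations_of))
print(dict(origins_of))
```

The cost of building both dictionaries is proportional to the number of edges that exist, not to the number of possible origin-destination pairs.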

This is the same problem as the GAMS blog post. Mathematically, it’s nice to write out the nested summation, and standalone tools like GAMS are designed to rewrite the math into something that is efficient. JuMP and PuLP are different, in that they execute what is written, so it’s up to you to implement the sparsity. There are upsides to being more flexible, but there are also downsides, in that the “simple” case is no longer quite so simple.

Anyone who wants some consulting work I’m about at the point where I pay up.

I sent you a private chat.

Yup so those dictionaries have been precomputed. Still big, but bearable. So just need to loop over those.
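
For reference, here is a rough sketch of what looping over such dictionaries could look like for the mass balance, in plain Python with made-up data. The names inflows, outflows and balance_terms are hypothetical; flowslist and supplylist only mimic the shape of the lists in the question.

```python
from collections import defaultdict

# Hypothetical data in the same shape as the lists in the question.
flowslist = [("A", "C", "x"), ("A", "C", "y"), ("B", "C", "y"), ("C", "A", "y")]
supplylist = [("A", "y"), ("C", "x"), ("C", "y")]

inflows = defaultdict(list)   # (node, coal type) -> flows arriving at that node
outflows = defaultdict(list)  # (node, coal type) -> flows leaving that node

# Group every flow once under its origin key and once under its destination key.
for i, j, ct in flowslist:
    outflows[(i, ct)].append((i, j, ct))
    inflows[(j, ct)].append((i, j, ct))

# Each supply item now looks up its own flows directly instead of
# scanning all of flowslist.
balance_terms = {
    item: (inflows.get(item, []), outflows.get(item, []))
    for item in supplylist
}

for item, (flows_in, flows_out) in balance_terms.items():
    print(item, "in:", flows_in, "out:", flows_out)
```

In the PuLP model, each list would be wrapped in lpSum over massflowtype[key]; in JuMP, it would be a sum over x[key]. With the sizes quoted in the question, the original filter over flowslist does roughly 150k × 500k ≈ 7.5 × 10^10 tuple comparisons for each of the two sums, whereas the grouped version touches each flow twice while building the dictionaries and then only the flows that belong to each item.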

What’s the total number of non-Inf cost arcs? How many variables and constraints in the final model? PuLP is the wrong tool for large models.