Explicitly defining dependent variables vs allowing solver to figure it out

I have an energy system optimisation model with storage. The energy balance for the storage technology looks something like this:

e[t] = e[t-1] + eff * c[t] - d[t] 

The only real decision variables are the charging and discharging (c and d); for the energy level you really only need to specify one point. Most people still declare e as a variable anyway and hope that the solver will realise this, simply because it's easier.
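
For concreteness, that conventional formulation would look roughly like this (a minimal sketch with placeholder bounds and an assumed efficiency value, not my actual model):

using JuMP
T = 8760
eff = 0.9                        # placeholder charging efficiency
m = Model()
@variable(m, c[1:T] >= 0)        # charging
@variable(m, d[1:T] >= 0)        # discharging
@variable(m, e[0:T] >= 0)        # energy level, declared as variables
@constraint(m, [t = 1:T], e[t] == e[t-1] + eff * c[t] - d[t])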

For various reasons, that’s not an option for me, so instead of a variable I define an array of expressions:

ax = 1:8760 # hours in a year
e = AxisArray(
	Array{GenericAffExpr,1}(undef, length(ax)), # 1-D array of affine expressions
	ax
)
@variable(m, e0 >= 0) # Initial energy level, decision variable
e[1] = @expression(m, 1*e0)
for t in ax[2:end]
    e[t] = @expression(m, e[t-1] + eff * c[t] - d[t])
end

Doing this, however, is really slow! It is much slower than just declaring e as a variable and letting Gurobi figure it out. Not only that, but Julia crashes when it tries to solve this.

I could try to write something like this:

e[2:end] = @expression(m, [t = 2:length(ax)], e0 + sum(eff * c[tt] - d[tt] for tt = 1:t))

But my representation of time is more complex than that so it’s a lot harder than the above expression would make it seem. Also I’m not sure if that would help me with my problem of Julia crashing, so I’m reticent to do it.

Any suggestions? I assume I can't be the only one to have faced such a problem. Is the problem that I'm nesting expressions inside the loop?

Not only that, but Julia crashes when it tries to solve this.

What do you mean by “crash”?

This is fast for me:

using JuMP
T = 8769
model = Model()
@variable(model, c[1:T] >= 0)
@variable(model, d[1:T] >= 0)
@variable(model, e0 >= 0)
@expression(model, e[t=0:T], 1.0 * e0)
e[1:T] .= @expression(model, [t = 1:T], e[t - 1] + c[t] - d[t])

By crash I mean that Julia just quits, and I need to restart it.

So you define the expression first and then redefine it correctly after? Seems sensible. I’ll try and run a benchmark to see which is faster.

By crash I mean that Julia just quits, and I need to restart it.

This shouldn’t happen. Can you provide code to reproduce? What versions are you using (import Pkg; Pkg.status())? What’s the output of versioninfo()?

So you define the expression first and then redefine it correctly after?

I’m not really sure why your original version is slow. I didn’t spend enough time looking into it. I’ll add it to my TODO list. The second one worked for me.

The code is part of a package I’m developing, so it’s several thousand lines and tens of files at this point. Here is the result of Pkg.status():

Project GEPPR v0.1.0
    Status `~/Desktop/RepDays4Storage/GEPPR.jl/Project.toml`
  [39de3d68] AxisArrays v0.4.2
  [336ed68f] CSV v0.5.24
  [9961bab8] Cbc v0.6.6
  [a93c6f00] DataFrames v0.20.2
  [864edb3b] DataStructures v0.17.9
  [31c24e10] Distributions v0.22.4
  [5789e2e9] FileIO v1.2.2
  [59287772] Formatting v0.4.1
  [60bf3e95] GLPK v0.12.1
  [2e9cd046] Gurobi v0.7.5
  [033835bb] JLD2 v0.1.11
  [4076af6c] JuMP v0.21.1
  [7eb4fadd] Match v1.1.0
  [b8f27783] MathOptInterface v0.9.10
  [bac558e1] OrderedCollections v1.1.0
  [91a5bcdd] Plots v0.29.1
  [d330b81b] PyPlot v2.8.2
  [1a8c2f83] Query v0.12.2
  [4c63d2b9] StatsFuns v0.9.4
  [f3b207a7] StatsPlots v0.14.0
  [ddb6d928] YAML v0.3.2
  [ade2ca70] Dates 
  [37e2e46d] LinearAlgebra 
  [2f01184e] SparseArrays

And versioninfo():

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

I noticed that the output of your code is the following:

julia> e[0]
e0

julia> e[1]
e0 + c[1] - d[1]

julia> e[2]
e0 + c[2] - d[2]

This is probably why it’s so fast - it’s not what I want! What I actually want is the following:

julia> e[0]
e0

julia> e[1]
e0 + c[1] - d[1]

julia> e[2]
e0 + c[1] - d[1] + c[2] - d[2]

Like I said in my post, I could write something like this (adapting your code):

using JuMP
T = 8760
model = Model()
@variable(model, c[1:T] >= 0)
@variable(model, d[1:T] >= 0)
@variable(model, e0 >= 0)

e = AxisArray(
	Array{GenericAffExpr,1}(undef, T),
	1:T
)
e[1] = @expression(model, 1*e0)
e[2:end] = @expression(model, [t = 2:T], e0 + sum(c[tt] - d[tt] for tt = 1:t))

This gives me the following error, and I’m not sure why (I think the reasoning is sound):

ArgumentError: offset arrays are not supported but got an array with index other than 1
in require_one_based_indexing at abstractarray.jl:89

It also takes a while to produce this error (roughly 10 seconds), so I’m presuming this approach is also quite slow.

I also tried doing this in a comprehension:

e = [@expression(model, e0 + sum(c[tt] - d[tt] for tt=1:t)) for t=2:T]
e = vcat(@expression(model, 1*e0), e)

This took 16 seconds. If I recall correctly, Gurobi is able to remove linearly dependent constraints (i.e. perform its presolve phase) in less than a second, so at least for this problem there seems to be no point in doing this.

My mistake!

It seems you want:

using JuMP
T = 8769
model = Model()
@variable(model, c[1:T] >= 0)
@variable(model, d[1:T] >= 0)
@variable(model, e0 >= 0)
@expression(model, e[t=0:T], 1.0 * e0)
for t = 1:T
    e[t] = @expression(model, e[t - 1] + c[t] - d[t])
end
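
As a check (this is my expectation from the recursion, not re-run output), querying e[2] should now give the cumulative form you asked for:

julia> e[2]
e0 + c[1] - d[1] + c[2] - d[2]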

The issue was likely that the definition of the AxisArray wasn’t type-stable.
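
If you do want to keep building the container yourself, a type-stable version would use a concrete element type, for example (a sketch using a plain 1-based Vector in place of the AxisArray):

e = Vector{AffExpr}(undef, T)    # AffExpr is concrete; GenericAffExpr on its own is abstract
e[1] = @expression(model, 1 * e0)
for t in 2:T
    e[t] = @expression(model, e[t-1] + c[t] - d[t])
end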

Thank you, but my issue / question still remains - how should I decide when to use expressions instead of variables? If I use expressions I would think that I help the solver by explicitly telling it “these are dependent variables”, but as we’ve seen this is quite computationally expensive. It makes no sense to make an expression which takes 16 seconds to compute when the actual optimisation can be solved by Gurobi in less than 1 second.

I’m also starting to run into a similar issue. I use expressions quite often in my model since this makes for nicer code. However, it appears that this increases my computation time by quite a bit, so I should probably stop doing that. That’s quite annoying though!

I’m also a bit annoyed because my colleagues told me that JuMP was supposedly faster than GAMS because you pass “better” matrices to the solvers. This doesn’t seem to be true as far as I can tell, and trying to be smart about the matrices I pass to my solver actually makes things worse as we’ve seen here.

Admittedly I don’t know where my colleagues got this information - I don’t think they or I understood what “better” meant. I also don’t want to sound ungrateful - I really like JuMP and Julia and I really dislike GAMS. Nonetheless I feel like there must be room for improvement here.

Deciding whether to use variables or expressions is probably problem-specific. MILP solvers have very good presolves that can substitute out variables.
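
If you want a concrete way to decide for a given model, one option is simply to time both builds and compare that against the solve time. A rough sketch (build_with_variables and build_with_expressions are just stand-ins for the two formulations discussed above):

using JuMP

# Formulation 1: declare e as variables and add linking constraints;
# the solver's presolve can substitute them out.
function build_with_variables(T)
    model = Model()
    @variable(model, c[1:T] >= 0)
    @variable(model, d[1:T] >= 0)
    @variable(model, e[0:T] >= 0)
    @constraint(model, [t = 1:T], e[t] == e[t-1] + c[t] - d[t])
    return model
end

# Formulation 2: keep only e0 as a variable and build e as nested
# expressions, which grow to O(T^2) terms in total.
function build_with_expressions(T)
    model = Model()
    @variable(model, c[1:T] >= 0)
    @variable(model, d[1:T] >= 0)
    @variable(model, e0 >= 0)
    @expression(model, e[t = 0:T], 1.0 * e0)
    for t in 1:T
        e[t] = @expression(model, e[t-1] + c[t] - d[t])
    end
    return model
end

@time build_with_variables(8760)
@time build_with_expressions(8760)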

Did you try my code? It takes 6 seconds for me, not 16. It’s not super surprising that this takes a while. You’re doing an N^2 operation, creating 8769 expressions, each with up to 2 * 8769 terms! By my count, we’re creating 76.9 million non-zero terms.

JuMP is not faster than GAMS. In some cases, it is on-par. In other cases, it will be a little slower. Yes, there is room for improvement. Performance has been slipping recently with the transition to MathOptInterface. We’re aware of the issues.

Thanks for the answer, this has cleared things up for me.

The difference in timing (6 seconds for you versus 16 for me) is possibly because I have quite an old laptop, or possibly I misread the terminal output.