I am new to JuMP and trying to use it for maximum likelihood estimation. I found a very slow step when generating the expressions, and I am not sure how to improve it. Here is the code for a minimum working example:

M: number of locations; 20M
N: number of obs; 25M
X: data; 25M*100
location: The location of the observation
θ: # our parameters, dimension is (100,1)
model = Model() #generating the model
@variable(model, θ[i=1:100], start=θ_0[i])
@expressions(model, util = X*θ)
@NLexpressions(model,
begin
num[i=1:length(util)], exp(util[i])
num_sum[j=1:M], sum(num[findall(location.==j)][i] for i = 1:length(num[findall(location.==j)])) #this is the very slow step; I need to find all the observations that are in this location, j, and sum over all the num expressions
denom[j=1:M], 1+num_sum[j]
end
)

I tried multithreading but could not make it work (I am not sure if JuMP supports multithreading). Also, I tried to indirectly index the observations in a given location num_sum[j=1:M], sum(num[location.starting[j]:location.ending[j]][i] for i=1:length((num[location.starting[j]:location.ending[j]])), where location.starting and location.ending store the starting the ending indices for the starting and ending observations for a given location j. This actually slowed down the code. I was wondering if anyone has suggestions on how to improve the performance. Thanks!

Welcome! It’s more helpful if you can provide a fully reproducible example that someone can copy-paste to see what the problem is. But if I’ve understood your question correctly, I think you’re looking for something like this (I haven’t run, so there might be typos etc):

M = 20
N = 25
X = rand(N, 100)
locations = rand(1:M, N)
location_lookup = [Int[] for j in 1:M]
for (i, j) in enumerate(locations)
push!(locations_lookup[j], i)
end
model = Model()
@variable(model, θ[i=1:100])
@variable(model, y[1:N])
@constraint(model, y .== X * θ)
@NLexpression(model, denom[j=1:M], 1 + sum(exp(y[i]) for i in location_lookup[j]))

Your main problem is that location is a vector with 25,000,000 elements! So performing that look-up is expensive, and you are doing this 20,000,000 times! In situations like this, it can be helpful to step back and ask if there are ways to do less work, e.g., by doing the lookup once and caching the answer, instead of trying to use threading or similar.

Thank you so much for your help! This is exactly what I need. And you are right; I realized the costly step is to iterate through the whole location list just to find which observations are at that location but could not figure out a way to work around that issue. I think your suggested method works great!

May I ask a follow-up question? Would you mind explaining why did you specify y as a variable (@variable(model, y[1:N])) and then in the next line declare it as the constraint (@constraint(model, y.== X*θ)) versus writing it as the @expression?

You could try both ways because there’s a trade-off: adding extra variables and constraints can slow things down, but assuming you’ll use denom in a nonlinear constraint or objective, representing y as a variable can lead to a sparser Hessian which can be much faster to evaluate.