# Is JuMP suitable for very big problems?

This question has certainly been asked, but I couldn’t actually find a clear answer.

I have the following problem:

```julia
using JuMP, Ipopt

n = 10000
k = rand(n, n)
ke = 0.5 * (k + k')

function func(x::T...) where {T}
    y = collect(x)
    return y' * ke * y  # note: `ke` is a global here
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:n])
@objective(model, Min, func(x...))
```

Generating the model is really painful when n > 10^4, and I want n to be around 10^6. In my real problem `ke` is sparse and the optimization itself is quite quick, but for n = 10^5 I had to wait 11 hours for `@objective(model, Min, func(x...))` to execute, whereas the optimization took 8 minutes. Is there a way to speed things up?

I thought it was because of automatic differentiation, but providing the gradient and Hessian makes everything even slower, e.g.

```julia
function func_grad(g::AbstractVector, x...)
    y = collect(x)
    g .= 2 .* (ke * y)  # ∇(yᵀ ke y) = 2 ke y for symmetric ke
    return
end

function func_hess(h::AbstractMatrix, x...)
    h .= 2 .* ke  # ∇²(yᵀ ke y) = 2 ke
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, x[1:n])
@operator(model, op_f, n, func, func_grad, func_hess)
@objective(model, Min, op_f(x...))
```

By the way: if `ke` is sparse, is `h` also sparse?

I would expect providing the gradient and Hessian to improve performance, but the opposite happens. Is there something I can do about it?

JuMP handles affine and quadratic expressions in a special way and will take care of generating the gradient and Hessian efficiently.
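For example (a minimal sketch with a small illustrative matrix), a quadratic objective written directly in JuMP is recognized and stored as a `QuadExpr`, with no operator or user-supplied derivatives needed:

```julia
using JuMP
using LinearAlgebra

model = Model()
@variable(model, x[1:3])
Q = Matrix(2.0 * I, 3, 3)  # small symmetric matrix, just for illustration
# JuMP detects that this expression is quadratic and builds the
# gradient/Hessian structure itself; no @operator is needed.
@objective(model, Min, x' * Q * x)
objective_function(model)  # a QuadExpr
```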
As for your custom gradient and Hessian: you have a function with a large number of inputs passed via splatting, so it would be better to pass the argument as a vector. That is, however, not yet supported by the nonlinear expression interface; it's on the roadmap (see Roadmap · JuMP).
So it's best to focus on building this quadratic expression as efficiently as possible. Here are a few options:

```julia
using JuMP
using BenchmarkTools
import MathOptInterface as MOI

function func(ke, x::T...) where {T}
    y = collect(x)
    return y' * ke * y
end

no_splat(ke, y) = y' * ke * y

function jump(ke, y)
    acc = zero(QuadExpr)
    for i in axes(ke, 1)
        for j in axes(ke, 2)
            # accumulate ke[i, j] * y[i] * y[j] in place
            add_to_expression!(acc, ke[i, j], y[i], y[j])
        end
    end
    return acc
end

function moi(ke, y)
    acc = MOI.ScalarQuadraticTerm{Float64}[]
    for i in axes(ke, 1)
        ti = MOI.ScalarAffineTerm(1.0, index(y[i]))
        for j in axes(ke, 2)
            tj = MOI.ScalarAffineTerm(ke[i, j], index(y[j]))
            push!(acc, MOI.Utilities.operate_term(*, ti, tj))
        end
    end
    return MOI.ScalarQuadraticFunction(acc, MOI.ScalarAffineTerm{Float64}[], 0.0)
end

function bench(n)
    k = rand(n, n)
    ke = 0.5(k + k')
    model = Model()
    @variable(model, x[1:n])
    @btime @objective($model, Min, $func($ke, $x...))
    @btime @objective($model, Min, $no_splat($ke, $x))
    @btime @objective($model, Min, $jump($ke, $x))
    @btime @objective($model, Min, $moi($ke, $x))
    return
end

bench(2000)
```

gives

```
935.334 ms (46072 allocations: 573.62 MiB)
933.363 ms (44068 allocations: 573.51 MiB)
529.620 ms (67 allocations: 215.56 MiB)
 34.654 ms (23 allocations: 201.63 MiB)
```

The second one just shows that you don't need splatting (you only need it if you use an `@operator`).
The third one is a bit faster since it does not create the auxiliary vector of affine expressions `ke * y`. However, a big cost remains because JuMP stores quadratic expressions in a dictionary, which is nice when you have a lot of duplicate terms but doesn't help in your case.
The fourth one uses MOI directly and is much faster. There is a small detail with `MOI.ScalarQuadraticTerm`: the coefficient should be scaled by 2 in some cases, so I use `MOI.Utilities.operate_term` with affine terms to have this handled automatically.
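To illustrate that scaling (a small sketch, assuming JuMP and MOI are available): multiplying an affine term by itself yields a quadratic term whose coefficient is doubled, because `MOI.ScalarQuadraticFunction` uses the `½ xᵀQx + aᵀx + b` convention:

```julia
using JuMP
import MathOptInterface as MOI

model = Model()
@variable(model, z)
t = MOI.ScalarAffineTerm(1.0, index(z))
# z * z: both sides reference the same variable, so operate_term
# doubles the coefficient to match the ½ xᵀQx convention.
q = MOI.Utilities.operate_term(*, t, t)
q.coefficient  # 2.0
```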

For sparse matrices, you can adapt the `moi` function to iterate only over the nonzero elements.
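A sketch of such a sparse variant (the name `moi_sparse` is hypothetical), using `SparseArrays.findnz` to visit only the stored nonzeros of `ke` instead of all n² pairs:

```julia
using JuMP, SparseArrays
import MathOptInterface as MOI

# Hypothetical sparse adaptation of the dense `moi` function:
# only structurally nonzero entries of ke generate quadratic terms.
function moi_sparse(ke::SparseMatrixCSC, y)
    terms = MOI.ScalarQuadraticTerm{Float64}[]
    I, J, V = findnz(ke)  # row indices, column indices, nonzero values
    for (i, j, v) in zip(I, J, V)
        ti = MOI.ScalarAffineTerm(1.0, index(y[i]))
        tj = MOI.ScalarAffineTerm(v, index(y[j]))
        push!(terms, MOI.Utilities.operate_term(*, ti, tj))
    end
    return MOI.ScalarQuadraticFunction(terms, MOI.ScalarAffineTerm{Float64}[], 0.0)
end
```

For n = 10^5 with a sparse `ke`, this builds only `nnz(ke)` terms rather than 10^10 of them.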

cc @odow maybe this can be moved to the optimization category


Thank you! I will try to tweak it to my real problem.

Hi @mrys, welcome to the forum!

I’ve moved this to the optimization category.
