Performance issue when using kronecker product with ForwardDiff.Dual type

Hi there! Thanks for making a great reproducible example to start with. You can make it a little better by excluding the packages that aren’t relevant. For example, I didn’t need to download and install QueryVerse to run this example.

Two high level comments:

  • While JuMP supports user-defined functions, these should be used sparingly. Where possible, write out your expressions algebraically. This lets JuMP “see” the expression, so it can perform more analysis (like computing second derivatives); a toy illustration follows this list.
  • By default, user-defined functions use ForwardDiff to compute derivatives. Forward-mode automatic differentiation scales with the length of the input, so a function with 400 inputs will take ~400 times longer to evaluate the gradient than to evaluate the function itself; there is a small standalone sketch of this a little further below.
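
To illustrate the first point, here is a toy model (the names and the quadratic loss are made up for illustration; they are not from your code). The registered version is a black box to JuMP, while the commented-out algebraic version can be analyzed for exact second derivatives:

using JuMP, Ipopt

toy = Model(Ipopt.Optimizer)
@variable(toy, x[1:2])

# User-defined function: JuMP only gets first derivatives via ForwardDiff, no Hessian.
my_loss(a, b) = (a - 1)^2 + (b - 2)^2
register(toy, :my_loss, 2, my_loss; autodiff = true)
@NLobjective(toy, Min, my_loss(x[1], x[2]))

# Algebraic alternative: JuMP “sees” the expression and can compute second derivatives.
# @NLobjective(toy, Min, (x[1] - 1)^2 + (x[2] - 2)^2)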

So, as you can see, the combination of many user-defined functions with large inputs is what gets you into trouble. There are also a bunch of allocations (like the Vector{T}(undef, 1)) that don't help.
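
To put a rough number on the second point, here is a small standalone sketch (nothing here is from your code) of how the cost of forward-mode differentiation grows with the input:

using ForwardDiff

# Stand-in for a scalar user-defined function with many inputs.
f(x) = sum(xi^2 for xi in x)

x = rand(400)
f(x)                         # one function evaluation
ForwardDiff.gradient(f, x)   # forward mode: work grows with length(x)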

I’d write your code a completely different way, as follows:

# The data (N, K, s_t, d_t, d_t_lag, i_t) is assumed to be defined as in your example.
function optimization()
    model = Model(Ipopt.Optimizer)
    @variables(model, begin
        β, (start = -1.0)
        γ, (start = -1.0)
        W[i = 1:N, j = 1:N], (start = rand())
        penalty_1
        penalty_2
        g[i=1:size(i_t, 1), j=1:N]
    end)
    @expressions(model, begin
        e_t, s_t .- β * d_t .- γ * W * d_t_lag
    end)
    @constraints(model, begin
        [i=1:N], W[i, i] == 0.0
        [i=1:N], sum(W[i, :]) == 1.0
        [i=1:size(i_t, 1), j=1:N], g[i, j] == sum(i_t[i, t] * e_t[j, t] for t in 1:K)
    end)
    @NLconstraints(model, begin
        penalty_1 == sum(abs(w) for w in W)
        penalty_2 == sum(w^2 for w in W)
    end)
    @NLobjective(
        model, 
        Min, 
        sum(g[i, j]^2 for i in 1:size(i_t, 1), j in 1:N) +
        0.5 * penalty_1 + 0.5 * penalty_2,
    )
    optimize!(model)
    return model
end

model = optimization()

This runs, but Ipopt encounters an error. Unless I made a mistake somewhere, the most likely cause is the abs: Ipopt assumes functions are smooth and twice differentiable, so it often has numerical problems with abs.

Using the standard reformulation trick for abs (write x == y_1 - y_2 with y_1, y_2 >= 0, so that abs(x) can be replaced by y_1 + y_2 in the minimization), I obtained:

function optimization()
    model = Model(Ipopt.Optimizer)
    @variables(model, begin
        β, (start = -1.0)
        γ, (start = -1.0)
        W[i = 1:N, j = 1:N], (start = rand())
        penalty_1
        penalty_2
        g[i=1:size(i_t, 1), j=1:N]
        W_plus[1:N, 1:N] >= 0  # positive part of W
        W_neg[1:N, 1:N] >= 0   # negative part of W
    end)
    @expressions(model, begin
        e_t, s_t .- β * d_t .- γ * W * d_t_lag
    end)
    @constraints(model, begin
        [i=1:N], W[i, i] == 0.0
        [i=1:N], sum(W[i, :]) == 1.0
        [i=1:size(i_t, 1), j=1:N], g[i, j] == sum(i_t[i, t] * e_t[j, t] for t in 1:K)
        # W == W_plus - W_neg, so the abs penalty becomes the linear sum below.
        W_plus .- W_neg .== W
        penalty_1 == sum(W_plus) + sum(W_neg)
        penalty_2 == sum(w^2 for w in W)
    end)
    @NLobjective(
        model, 
        Min, 
        sum(g[i, j]^2 for i in 1:size(i_t, 1), j in 1:N) +
        0.0 * penalty_1 + 1 * penalty_2,
    )
    optimize!(model)
    return model
end

model = optimization()

This solves in 10 iterations and 0.5 seconds, but I’ll leave it to you to check the solution is reasonable and that I didn’t make a typo somewhere.
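
For example, one way to sanity-check the result after the solve (W is looked up by the name used in the macro):

@assert termination_status(model) == MOI.LOCALLY_SOLVED
@show objective_value(model)
W_sol = value.(model[:W])
@show extrema(sum(W_sol; dims = 2))  # each row of W should sum to ~1.0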
