I’m following this paper, which uses this method (the Butler-Reeds-Dawson 1981 algorithm) to greatly speed up a large Tikhonov regularization problem:
min ||Kf - g||^2 + α||f||^2
s.t. f ≥ 0
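For context, the way I understand the BRD trick (and the way the Optimization.jl code below implements it): instead of solving for f directly, you minimize an unconstrained functional of a vector c of length size(K, 1),

χ(c) = 0.5 c'(αI + G(c)K')c - c'g,   where G(c) is K with its columns zeroed wherever K'c ≤ 0,

and then recover f = max(0, K'c). Setting the gradient to zero gives αc = g - Kf, i.e. c = r/α, which is where the r ./ α[end] in the GCV part further down comes from.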
I wanted to reproduce this method in Julia so that I can move away from the inflexible MATLAB GUI our group has been using so far.
Performance matters, since this code is run routinely on large data sets.
There’s another thread where people kindly provided suggestions for solving this problem. Some of those approaches give effectively the same results as the Optimization.jl method suggested earlier in this thread. The fastest I’ve tested so far is this JuMP implementation:
JuMP/Gurobi method
using JuMP, Gurobi, LinearAlgebra

function fit_brd_jump(K, g, α)                  # wrapper name is arbitrary
    model = Model(() -> Gurobi.Optimizer(GRB_ENV))
    set_silent(model)
    @variable(model, f[1:size(K, 2)] >= 0)      # non-negative solution vector
    @variable(model, r[1:size(K, 1)])           # residual vector r = g - K*f
    @constraint(model, r .== g .- K * f)
    @objective(model, Min, LinearAlgebra.dot(r, r) + α * LinearAlgebra.dot(f, f))
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return value.(f), value.(r)
end
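For completeness, this is roughly how I call it; the shared Gurobi environment GRB_ENV and the wrapper name fit_brd_jump are just my own conventions, with K, g and α assumed to be already defined:

const GRB_ENV = Gurobi.Env()    # reuse one Gurobi environment across repeated solves
f, r = fit_brd_jump(K, g, α)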
followed by the optimization method suggested above, modified as follows:
Optimization.jl method
using LinearAlgebra, Optimization, OptimizationOptimJL

# p = (α, g, K); G is K with its columns zeroed on the active set, i.e. wherever K'c ≤ 0
function minfun(c::AbstractVector, p = (α, g, K))
    G = copy(p[3])
    G[:, (p[3]' * c .<= 0)] .= 0
    return 0.5 * c' * ((p[1] * I + G * p[3]') * c) - c' * p[2]
end

function grad(gradient, c, p = (α, g, K))
    G = copy(p[3])
    G[:, (p[3]' * c .<= 0)] .= 0
    gradient .= (p[1] * I + G * p[3]') * c - p[2]
end

function hess(H, c, p = (α, g, K))
    G = copy(p[3])
    G[:, (p[3]' * c .<= 0)] .= 0
    H .= p[1] * I + G * p[3]'
end

optf = OptimizationFunction(minfun, grad = grad, hess = hess)
prob = OptimizationProblem(optf, ones(size(K, 1)), (α, g, K))
sol = solve(prob, NewtonTrustRegion(), x_tol = 1e-8, maxiters = 5000, maxtime = 100)
c = sol.u                          # dual vector
f = max.(0, K' * c)                # recover the non-negative solution from c
(although I reckon allocations in the latter could be greatly reduced by not copying G on every call, potentially making it the fastest; see the sketch below).
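To make that concrete, here is a possible low-allocation variant (my own untested sketch, not the code from the thread): since G*K'*c equals K * max.(0, K'*c), the objective and gradient never need a copy of K at all, and only the Hessian rewrites a preallocated buffer in place. The buffers are shared between the closures, so this is not thread-safe.

using LinearAlgebra

function make_brd_functions(K)
    n  = size(K, 2)
    Kc = zeros(eltype(K), n)     # buffer for K' * c
    f⁺ = zeros(eltype(K), n)     # buffer for max.(0, K' * c)
    G  = similar(K)              # buffer for the column-masked copy of K (Hessian only)

    function minfun(c, p)
        α, g, _ = p
        mul!(Kc, K', c)
        @. f⁺ = max(0, Kc)
        return 0.5 * (α * dot(c, c) + dot(Kc, f⁺)) - dot(c, g)
    end

    function grad!(gradient, c, p)
        α, g, _ = p
        mul!(Kc, K', c)
        @. f⁺ = max(0, Kc)
        mul!(gradient, K, f⁺)           # gradient = K * max.(0, K'c)
        @. gradient += α * c - g        # gradient = (αI + G K')c - g
        return gradient
    end

    function hess!(H, c, p)
        α, _, _ = p
        mul!(Kc, K', c)
        copyto!(G, K)
        G[:, Kc .<= 0] .= 0
        mul!(H, G, K')                  # H = G * K'
        H[diagind(H)] .+= α             # H = αI + G * K'
        return H
    end

    return minfun, grad!, hess!
end

The returned functions have the same signatures as above, so they can be passed straight to OptimizationFunction, e.g. mf, g!, h! = make_brd_functions(K); optf = OptimizationFunction(mf, grad = g!, hess = h!).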
I’d also like to optimize α at the end of the day.
Currently I’m following the method suggested here, which uses generalized cross-validation (GCV) for that: the idea is to minimize a “score” produced by the function below.
GCV score function
# Core method: c is the dual vector from the Optimization.jl method,
# s holds the singular values of K, and only the latest α (α[end]) is used.
function gcv_score(α, s, c)
    ñ = length(s)
    σ² = s .^ 2
    m̂ = sum(σ² ./ (σ² .+ α[end]))
    dm̂ = sum(σ² ./ ((σ² .+ α[end]) .^ 2))
    t̂ = sum((s .* c) .^ 2 ./ (σ² .+ α[end]))
    φₙ = α[end]^2 * (c' * c) * ñ / (ñ - m̂)^2         # GCV score to be minimized
    αₙ = α[end]^2 * (c' * c) * dm̂ / (t̂ * (ñ - m̂))    # α update value, test this one next
    return φₙ, αₙ
end

# For the JuMP/Gurobi method only the residuals r are returned; there c = r / α.
# (Two untyped 3-argument methods both named gcv_score would overwrite each other,
# hence the separate name for the residual version.)
gcv_score_residuals(α, s, r) = gcv_score(α, s, r ./ α[end])
This function returns the score to be minimized as well as the next α to try. The plan is to start from a large α and evaluate the score repeatedly until a minimum is found (if the new α is larger than the previous one, you stop); a sketch of that loop is below.
Frankly, I have no idea how this formula is derived or how sensitive the result is to the exact choice of α, but it’s fast and converges within 4-7 iterations in most cases.
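For reference, the loop I have in mind looks roughly like this (all names are mine; fit_brd_jump is the JuMP wrapper from earlier, gcv_score_residuals the residual wrapper above, and s holds the singular values of K):

function optimize_alpha(K, g, s; α₀ = 1e3, maxiter = 20)
    α = [α₀]                                    # history of tested α values
    φ_prev = Inf
    f = r = nothing
    for _ in 1:maxiter
        f, r = fit_brd_jump(K, g, α[end])
        φ, αₙ = gcv_score_residuals(α, s, r)
        (αₙ >= α[end] || φ >= φ_prev) && break  # stop once the score stops improving
        φ_prev = φ
        push!(α, αₙ)
    end
    return f, α[end]
end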
I’m aware you’ve previously recommended the following.
I did try to play around with it using OptimizationBBO, but I haven’t found a setup yet that converges as fast as the “α update” method above (a sketch of what I mean is below). I’m still working on that, and any suggestions would be welcome.
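By that I mean a direct search over log10(α), where the NNLS problem is re-solved at every evaluation and the GCV score φ is the objective. All names, bounds and iteration counts below are placeholders, not my actual script:

using Optimization, OptimizationBBO

function gcv_objective(x, p)
    K, g, s = p
    α = 10.0^x[1]                           # search in log10(α)
    f, r = fit_brd_jump(K, g, α)            # any of the solvers above works here
    φ, _ = gcv_score([α], s, r ./ α)        # c = r / α for the JuMP solution
    return φ
end

prob_α = OptimizationProblem(gcv_objective, [0.0], (K, g, s); lb = [-6.0], ub = [6.0])
sol_α = solve(prob_α, BBO_adaptive_de_rand_1_bin_radiuslimited(); maxiters = 50)
α_opt = 10.0^sol_α.u[1]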
The OneDrive link above has been updated to include the “s” vector required by the GCV function, in case someone wants to play around with it. Inside you will also find .pdf files of the papers cited above, in case someone does not have access.