Fixing a parameter with Optimization.jl

Suppose I define the following problem which computes the maximum likelihood estimates (MLEs) for a regression problem using Optimization.jl:

using Optimization, OptimizationOptimJL, Random, Distributions
β₀, β₁, β₂, β₃, n, σ = -1.0, 1.0, 0.5, 3.0, 300, 0.1
Random.seed!(12301)
x₁ = rand(Uniform(-1, 1), n)
x₂ = rand(Normal(1.0, 0.5), n)
ε = rand(Normal(0.0, σ), n)
y = @. β₀ + β₁ * x₁ + β₂ * x₂ + β₃ * x₁ * x₂ + ε
function loglik(θ, p)
    β = @views θ[1:4]
    σ² = θ[5]
    y, x₁, x₂ = p
    n = length(y)  # avoid relying on the untyped global n
    ℓℓ = -n / 2 * log(2π * σ²)
    for i in eachindex(y)
        ℓℓ -= 0.5 / σ² * (y[i] - β[1] - β[2] * x₁[i] - β[3] * x₂[i] - β[4] * x₁[i] * x₂[i])^2
    end
    return ℓℓ
end
func = OptimizationFunction((u, p) -> -loglik(u, p), Optimization.AutoForwardDiff())
prob = OptimizationProblem(func, ones(5), (y, x₁, x₂); lb = [-Inf, -Inf, -Inf, -Inf, 0.0], ub = [Inf, Inf, Inf, Inf, Inf])
solve(prob, NelderMead())

Is there an easy way to extend the existing OptimizationProblem, prob, to fix a certain parameter? I’d want to fix, say, σ and optimise over the other parameters, repeating this over a large range of σ values, so it would have to be reasonably efficient. (This is computing a profile likelihood.) This would be done for many parameters, and the computations that follow will require the MLE for the full problem, so I do need this full prob to start with.

I thought about just setting lb = ub = the fixed value of σ for this parameter, using remake to make that change to prob, but this seems like it could cause problems. Any ideas?

Do (u, p) -> -loglik(u, p, σ²) and close over σ², then just optimize over the 4 remaining values.
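
For example, something like this (a sketch; loglik_fixed is a hypothetical three-argument variant of your loglik, and y, x₁, x₂ are the data from your post):

using Optimization, OptimizationOptimJL

# Hypothetical three-argument variant of loglik, with σ² passed in explicitly
function loglik_fixed(β, p, σ²)
    y, x₁, x₂ = p
    n = length(y)
    ℓℓ = -n / 2 * log(2π * σ²)
    for i in eachindex(y)
        ℓℓ -= 0.5 / σ² * (y[i] - β[1] - β[2] * x₁[i] - β[3] * x₂[i] - β[4] * x₁[i] * x₂[i])^2
    end
    return ℓℓ
end

σ²_fixed = 0.01  # the value σ² is held at for this profile point
func_fixed = OptimizationFunction((u, p) -> -loglik_fixed(u, p, σ²_fixed),
                                  Optimization.AutoForwardDiff())
prob_fixed = OptimizationProblem(func_fixed, ones(4), (y, x₁, x₂))
solve(prob_fixed, NelderMead())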


Thank you @ChrisRackauckas. Does that not cause significant overhead if this is done many times, since OptimizationFunction and OptimizationProblem have to be reconstructed every time? Maybe it’s unavoidable for this sort of thing anyway.

The value is fixed, right? So the optimization should be able to run without constructing any new OptimizationFunction or OptimizationProblem objects.


Sorry, by “many times” I mean doing this same thing for, say, σ² ∈ 0.01:0.01:1.00, so that a new problem is constructed for each individual σ² and then optimised, and likewise for each subsequent value (so 100 problems in this example).
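
Concretely, something like this is what I have in mind (a sketch reusing the hypothetical loglik_fixed from above, warm-starting each solve from the previous optimum):

function profile_σ²(σ²_grid, y, x₁, x₂)
    u0 = ones(4)                      # initial guess, refined as we go
    profile = zeros(length(σ²_grid))
    for (j, σ²) in enumerate(σ²_grid)
        f = OptimizationFunction((u, p) -> -loglik_fixed(u, p, σ²),
                                 Optimization.AutoForwardDiff())
        prob_j = OptimizationProblem(f, u0, (y, x₁, x₂))
        sol = solve(prob_j, NelderMead())
        u0 = sol.u                    # warm start for the next σ²
        profile[j] = -sol.objective   # profile log-likelihood at this σ²
    end
    return profile
end

profile = profile_σ²(0.01:0.01:1.00, y, x₁, x₂)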

ParameterHandling.jl is a fantastic package for managing optimization parameters. It allows you to assign bounds and fix parameters, among other things, and it even handles parameter scaling automatically. I have used it alongside Optimization.jl to great effect.


This package looks very useful; I’ll definitely look into it. Do you have any good examples of how you’ve used it alongside Optimization.jl? Only if you don’t mind sharing, of course; no problem otherwise.


Sure, here’s a version of your example using ParameterHandling:

using ParameterHandling
using Optimization
using OptimizationOptimJL
using Random
using Distributions

β₀, β₁, β₂, β₃, n, σ = -1.0, 1.0, 0.5, 3.0, 300, 0.1
Random.seed!(12301)
x₁ = rand(Uniform(-1, 1), n)
x₂ = rand(Normal(1.0, 0.5), n)
ε = rand(Normal(0.0, σ), n)
y = @. β₀ + β₁ * x₁ + β₂ * x₂ + β₃ * x₁ * x₂ + ε

params = (σ² = bounded(1.0, 0.0, 1e9), β = ones(4)) # β doesn't have bounds, just initial value
vals, unflatten = value_flatten(params)
function loglik(θ, data)
    (; σ², β) = θ
    y, x₁, x₂ = data
    n = length(y)  # avoid relying on the untyped global n
    ℓℓ = -n / 2 * log(2π * σ²)
    for i in eachindex(y)
        ℓℓ -= 0.5 / σ² * (y[i] - β[1] - β[2] * x₁[i] - β[3] * x₂[i] - β[4] * x₁[i] * x₂[i])^2
    end
    return ℓℓ
end
func = OptimizationFunction((θ, data) -> -loglik(unflatten(θ), data), Optimization.AutoForwardDiff())
prob = OptimizationProblem(func, vals, (y, x₁, x₂))
θhat = solve(prob, NelderMead())
soln = unflatten(collect(θhat))

# Fix σ²:
params = (σ² = fixed(1.0), β = ones(4))
vals, unflatten = value_flatten(params)
func = OptimizationFunction((θ, data) -> -loglik(unflatten(θ), data), Optimization.AutoForwardDiff())
prob = OptimizationProblem(func, vals, (y, x₁, x₂))
θhat = solve(prob, NelderMead())
soln = unflatten(collect(θhat))

Note that I prefer to call the second argument of the objective function data, reserving the term params for quantities like θ that are estimated in the statistical sense.
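
For the profiling itself, the same pattern applies: fixed removes σ² from the flat vector, so you can rebuild the (cheap) flatten/unflatten pair for each value. A sketch, reusing loglik and the data tuple from above:

for σ² in 0.01:0.01:1.00
    params_j = (σ² = fixed(σ²), β = ones(4))
    vals_j, unflatten_j = value_flatten(params_j)
    f = OptimizationFunction((θ, data) -> -loglik(unflatten_j(θ), data),
                             Optimization.AutoForwardDiff())
    prob_j = OptimizationProblem(f, vals_j, (y, x₁, x₂))
    sol = solve(prob_j, NelderMead())
    # record unflatten_j(sol.u) and/or -sol.objective here as needed
end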


Very slick, thanks so much for that. Can I just ask what the purpose of the (; σ², β) = θ line is? I would just write σ², β = θ (as you do for the line y, x₁, x₂ = data) - is there a difference?

Sure, the (; σ², β) = θ syntax does multiple assignment by field names, which is useful for unpacking NamedTuples like the one above.

In contrast, σ², β = θ only unpacks by order, and can be used even for objects without field names. This is a bit more fragile; I prefer to use names instead of relying on order when possible.
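
A quick illustration with a throwaway NamedTuple:

θ = (σ² = 0.25, β = [1.0, 2.0])
(; β, σ²) = θ   # unpacks by name: σ² == 0.25, β == [1.0, 2.0]; order is irrelevant
a, b = θ        # unpacks by iteration order: a == 0.25, b == [1.0, 2.0]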

edit: I couldn’t find anything in the docs about (; σ², β) = θ, but it’s the first new language feature listed in the 1.7 release notes: Julia v1.7 Release Notes · The Julia Language
