Hi there! This post is about a toy example exploring whether and how automatic differentiation (AD) can be used to determine Lagrange multipliers.
Setup
Take the following toy problem:

f(\varepsilon) = \min_{x \in \mathbb{R}^2} \; x_1^2 + x_2^2 \quad \text{subject to} \quad x_1 + x_2 = \varepsilon,

where f(\varepsilon) is the optimal value function. A bit of manipulation (and standard textbook results on the sensitivity of the optimal value w.r.t. the constraint level) yields

\frac{\mathrm{d} f}{\mathrm{d} \varepsilon}(\varepsilon) = -\lambda(\varepsilon),

where \lambda is the Lagrange multiplier w.r.t. the equality constraint.
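For completeness, here is the bit of manipulation spelled out, since it is where the linear system in the code below comes from. The Lagrangian of the toy problem is

\mathcal{L}(x, \lambda; \varepsilon) = x_1^2 + x_2^2 + \lambda\,(x_1 + x_2 - \varepsilon),

and setting its gradient w.r.t. (x, \lambda) to zero gives the stationarity and feasibility conditions

2 x_1 + \lambda = 0, \qquad 2 x_2 + \lambda = 0, \qquad x_1 + x_2 = \varepsilon,

i.e. exactly the linear KKT system

\begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \varepsilon \end{pmatrix}.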
Code that works
The following piece of code works just fine:

using Zygote

objective_function(x) = sum(x.^2)

# The KKT conditions of the toy problem form a linear system;
# its solution is z = [x₁, x₂, λ].
function kkt_solution(ε)
    [2 0 1; 0 2 1; 1 1 0] \ [0; 0; ε]
end

# Optimal value f(ε): evaluate the objective at the primal part of the KKT solution.
function optimal_value(ε)
    z = kkt_solution(ε)
    objective_function(z[1:2])
end

# The Lagrange multiplier is the third component of the KKT solution.
lambda_value(ε) = kkt_solution(ε)[3]
julia> ε = 1.23
1.23
julia> optimal_value(ε) == ε^2/2
true
julia> optimal_value'(ε) == ε == -lambda_value(ε)
true
Code that does not work
So far, so good: the linear version works, presumably because Zygote knows how to differentiate through the backslash solve. Usually, however, the KKT system is not a simple linear system of equations but a generic system of nonlinear algebraic equations

G(z; \varepsilon) = 0.

Of course, there are Julia packages that can solve such systems, NLsolve being one of them. But can we use them in conjunction with AD?
using Zygote, NLsolve

objective_function(x) = sum(x.^2)

function optimal_value_nlsolve(ε)
    # KKT residual in the in-place form f!(F, z) that NLsolve expects.
    function f!(F, z)
        F .= [2 0 1; 0 2 1; 1 1 0]*z - [0; 0; ε]
    end
    # Solve the (here still linear) KKT system iteratively, starting from ε*ones(3).
    res = nlsolve(f!, ε*ones(3))
    objective_function(res.zero[1:2])
end
julia> ε = 1.23
1.23
julia> isapprox(optimal_value_nlsolve(ε), ε^2/2)
true
julia> optimal_value_nlsolve'(ε)
ERROR: TypeError: in ccall: first argument not a pointer or valid constant expression, expected Ptr, got Tuple{Symbol,String}

Apparently Zygote tries to trace through NLsolve's internals and trips over a ccall it cannot differentiate.
Question
I am fully aware that what I'm asking for is a lot (really, a lot!), but: is it possible to use AD together with generic iterative solvers such as NLsolve? If so, what's the way to go?
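For concreteness, here is the kind of construction I have in mind: keep the solver opaque to the AD tool and attach a hand-written derivative rule based on the implicit function theorem. If G(z; \varepsilon) = 0 defines z(\varepsilon), then dz/d\varepsilon = -(\partial G/\partial z)^{-1} (\partial G/\partial \varepsilon). The following is only a sketch under my assumptions: solve_kkt, G, and optimal_value_ift are names I made up for illustration, and I am assuming Zygote.@adjoint and ForwardDiff behave as documented.

using Zygote, NLsolve, ForwardDiff, LinearAlgebra

objective_function(x) = sum(x.^2)

# Residual G(z, ε) whose root z*(ε) we want to differentiate.
# Here it is the linear KKT system again, but nothing below relies on linearity.
G(z, ε) = [2 0 1; 0 2 1; 1 1 0]*z - [0; 0; ε]

# Plain solver call; Zygote never has to look inside NLsolve.
solve_kkt(ε) = nlsolve((F, z) -> F .= G(z, ε), ε*ones(3)).zero

# Implicit function theorem: G(z*(ε), ε) = 0 implies dz*/dε = -(∂G/∂z)⁻¹ ∂G/∂ε,
# so the reverse-mode pullback of a cotangent z̄ is ε̄ = -⟨∂G/∂ε, (∂G/∂z)⁻ᵀ z̄⟩.
Zygote.@adjoint function solve_kkt(ε)
    z = solve_kkt(ε)
    function solve_kkt_pullback(z̄)
        Jz = ForwardDiff.jacobian(w -> G(w, ε), z)    # ∂G/∂z at the solution
        Jε = ForwardDiff.derivative(e -> G(z, e), ε)  # ∂G/∂ε at the solution
        (-dot(Jε, Jz' \ collect(z̄)),)
    end
    return z, solve_kkt_pullback
end

optimal_value_ift(ε) = objective_function(solve_kkt(ε)[1:2])

If this pattern is sound, optimal_value_ift'(1.23) should again return 1.23, with the multiplier available as solve_kkt(ε)[3] as before. Is something along these lines the intended way, or is there a more established mechanism?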
Thanks heaps!