using Optimization, OptimizationOptimJL, LinearAlgebra

K = rand(10, 10)
g = rand(10)
α = 1

# Smoothed L1 norm: sum of smoothed absolute values of the entries of x
function sL1(x::AbstractVector{T}, mu=1.e-7) where {T}
    return sum(Cabs.(x, mu))
end

# Smooth approximation of abs(x): sqrt(x^2 + 4 * mu^2)
function Cabs(x, mu)
    p = 4.0 * mu * mu
    return sqrt(x * x + p)
end
# Cost: least-squares data term plus smoothed L1 penalty
# f = unknown, p = (K, s, α)
function G(f, p)
    K, s, α = p
    r = K * f - s
    return r' * r + α * sL1(f, 1.e-7)
end
optf = Optimization.OptimizationFunction(G, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, ones(size(K, 2)), (K, g, α);
    lb=zeros(size(K, 2)), ub=fill(Inf, size(K, 2)))
f = OptimizationOptimJL.solve(prob, OptimizationOptimJL.BFGS())
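The returned object follows the usual SciML solution interface, so the minimizer and the final cost can be read off directly, e.g.:

f.u          # the minimizer found by the solver
f.objective  # the value of G at the minimizer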
It does converge to something, but it’s pretty slow compared to the other methods. Perhaps the way I’ve written it is far from optimal.
I assume we can avoid allocating r in the cost function, but I’m not sure what the best way to do that would be.
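One way to avoid allocating r (and the intermediate K * f) would be to accumulate the residual row by row inside the cost itself. This is only a sketch, with G_noalloc being a name made up here; it assumes K is dense, reuses sL1 from above, and promotes the accumulator so it stays compatible with ForwardDiff’s dual numbers:

function G_noalloc(f, p)
    K, s, α = p
    # accumulator type covers both Float64 and ForwardDiff.Dual inputs
    acc = zero(promote_type(eltype(K), eltype(f)))
    @inbounds for i in axes(K, 1)
        ri = -s[i] * one(eltype(f))      # i-th residual entry, built in place
        for j in axes(K, 2)
            ri += K[i, j] * f[j]
        end
        acc += ri * ri                   # accumulates r' * r without storing r
    end
    return acc + α * sL1(f, 1.e-7)
end

It can then be passed to OptimizationFunction in place of G; whether it is actually faster would have to be benchmarked.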
It seems you have precomputed K and g, so the allocations are not coming from that.
If something else already does what you want and is fast enough, you should probably use that; you can invest a lot of time in the Optimization.jl docs before figuring this problem out.
I am not aware of algorithms that do constrained PDHG, so you would have to work with a cost that is plus infinity for negative input and check what its prox (and gradient) are in the PDHG setting. One rephrasing could be to work on the set of positive numbers (as a manifold), but I am quite biased there, since the Chambolle-Pock implementation I linked can (among other things) do exactly that (and I implemented it). You would still need the corresponding proxes and gradients then.
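For what it is worth, the “plus infinity for negative input” cost is the indicator function of the nonnegative orthant, and its prox is simply the projection onto that set, independent of the step size. A minimal sketch (the names here are made up):

# Indicator of the nonnegative orthant: 0 on the feasible set, Inf outside
indicator_nonneg(x) = all(xi -> xi >= 0, x) ? 0.0 : Inf

# Its proximal map is the projection onto the orthant, for any step size λ
prox_indicator_nonneg(λ, x) = max.(x, 0)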