IPNewton (suddenly) fails on trivial example

Hi, I’ve run into a problem where interior-point Newton (IPNewton) fails on the following easy example:

using ForwardDiff, Optim

norm_c!(c, x) = (c[1] = sum(y -> y^2, x); c)

constraints(d) = TwiceDifferentiableConstraints(norm_c!, fill(-1., d), fill(1.,d), [-Inf], [1.])

function check_optim(d)
    f = x -> sum(x)

    foc = Optim.optimize(f, constraints(d), zeros(Float64, d), IPNewton(); autodiff = :forward)
    display(foc)
end
check_optim(2)

gives

 * Status: failure (reached maximum number of iterations)

 * Candidate solution
    Final objective value:     -1.414209e+00

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 1.85e-05 ≰ 0.0e+00
    |x - x'|/|x'|          = 2.61e-05 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.10e-09 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 7.77e-10 ≰ 0.0e+00
    |g(x)|                 = 1.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    1000
    f(x) calls:    13176
    ∇f(x) calls:   13176

To explain why I wrote ‘suddenly’ in the title, let me un-XY: I was working on a project and made a sequence of changes to my code, touching both infrastructure and the precise nature of my optimization problem. I then noticed a sizable (~100×) slowdown, which I tracked down to many, many (nearly) polynomial optimizations with convex, compact supports failing. This seemed strange, so I reduced it to the MWE above, which feels very weird. In the meantime I did clean up my imports, but I’d be surprised if that were the cause. For reference, here is my Project.toml:

[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
DiffResults = "163ba53b-c6d8-5494-b064-1a9d43ac40c5"
DispatchDoctor = "8d63f2c5-f18a-4cf2-ba9d-b3f60fc568c8"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
FastClosures = "9aa1b823-49e4-5ca5-8b0f-3971ec8bab6a"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GLMakie = "e9467ef8-e4e7-5192-8a1a-b1aee30e663a"
HypothesisTests = "09f84164-cd44-5f33-b23f-e6b0d136a0d5"
Interpolations = "a98d9a8b-a2ab-59e6-89dd-64a1c18fca59"
JLD2 = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Optim = "429524aa-4258-5aef-a3af-852621145aeb"
Profile = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Revise = "295af30f-e4ad-537b-8983-00126c2a3abe"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

Okay, picking a different initial point fixes the non-convergence problem (which I would not have expected to matter at all in such a simple example), but now the reported minimizer is not even feasible:

using ForwardDiff, Optim, Random, LinearAlgebra
Random.seed!(42)

norm_c!(c, x) = (c[1] = sum(y -> y^2, x); c)

constraints(d) = TwiceDifferentiableConstraints(norm_c!, fill(-1., d), fill(1.,d), [-Inf], [1.])

function check_optim(d)
    f_sum = x -> ones(d) ⋅ x

    foc_sum = Optim.optimize(f_sum, constraints(d), [0.5 for _ in 1:d], IPNewton(); autodiff = :forward)
    display(foc_sum)
    display(foc_sum.minimizer)
    display(norm_c!([NaN], foc_sum.minimizer))
end
check_optim(2)

gives

 * Status: success (objective increased between iterations)

 * Candidate solution
    Final objective value:     -1.999997e+00

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 1.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    24
    f(x) calls:    77
    ∇f(x) calls:   77

2-element Vector{Float64}:
 -0.9999983923736098
 -0.9999983923736098
1-element Vector{Float64}:
 1.999993569499608

I’m very confused by this behaviour.
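For completeness, the infeasibility can be checked directly without going through norm_c!. The helper below is hypothetical (my own, not from Optim), just restating the box and norm constraints:

```julia
using LinearAlgebra

# Hypothetical helper: check the box constraint x in [-1, 1]^d and the
# nonlinear constraint sum(x.^2) <= 1 directly, with a small tolerance.
function feasible(x; tol = 1e-8)
    all(-1 - tol .<= x .<= 1 + tol) && sum(abs2, x) <= 1 + tol
end

x = [-0.9999983923736098, -0.9999983923736098]  # minimizer from the run above
@show sum(abs2, x)   # ≈ 1.99999, well outside the c[1] <= 1 constraint
@show feasible(x)    # false: inside the box, but outside the unit ball
```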

In my recent question (Optim finite differences use a vectorized version), the following “sphere trick” was suggested to me. If I use it here, the results are fine:

using Random, LinearAlgebra, BenchmarkTools
using Optim, ForwardDiff
Random.seed!(42)

norm_c!(c, x) = (c[1] = sum(y -> y^2, x); c)

constraints(d) = TwiceDifferentiableConstraints(norm_c!, fill(-1., d), fill(1.,d), [-Inf], [1.], :forward)

function sphere_trick(a::Vector{T}) where {T<:Real}
    a_coords = view(a, 1:length(a)-1)
    if all(x -> x == 0, a_coords)
        return a_coords
    else
        return a_coords * a[end] / (sqrt(sum(y -> y^2, a_coords)))
    end
end
constraints_alt(d) = TwiceDifferentiableConstraints(vcat(fill(-1., d), [0.]), vcat(fill(1.,d), [1.]))

function check_optim(d)
    f_sum = x -> ones(d) ⋅ x

    cons = constraints(d)
    init = [0. for _ in 1:d]
    foc_sum = Optim.optimize(f_sum, cons, init, IPNewton(); autodiff = :forward)
    display(foc_sum)
    display(foc_sum.minimizer)
    display(norm_c!([NaN], foc_sum.minimizer))

    f_sum_alt = x -> ones(d) ⋅ sphere_trick(x)
    cons_alt = constraints_alt(d)
    init_alt = vcat(init, 0.5)
    foc_alt = Optim.optimize(f_sum_alt, cons_alt, init_alt, IPNewton(); autodiff = :forward)
    display(foc_alt)
    display(foc_alt.minimizer)
    min_true = foc_alt.minimizer[1:end-1] ./ sqrt(sum(y -> y^2, foc_alt.minimizer[1:end-1]))*foc_alt.minimizer[end]
    display(norm_c!([NaN], foc_alt.minimizer[1:end-1]))
    display(norm_c!([NaN], min_true))
end
check_optim(2)

gives (showing only the output of the second, sphere-trick optimization):

* Convergence measures
    |x - x'|               = 8.45e-14 ≰ 0.0e+00
    |x - x'|/|x'|          = 8.45e-14 ≰ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 1.41e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    39
    f(x) calls:    62
    ∇f(x) calls:   62

3-element Vector{Float64}:
 -0.01523778048002576
 -0.015237780479863912
  1.0
1-element Vector{Float64}:
 0.0004643799079099757
1-element Vector{Float64}:
 1.0000000000000002

In my current problem I can probably get away with simply doing this, but it does feel a bit suboptimal if I have to keep doing it, as I now need to distinguish between optimize calls based on the nature of the constraints, whereas before these were simply reflected in the TwiceDifferentiableConstraints instances.
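One possible middle ground I’ve considered (my own workaround, not something Optim documents): keep the original single optimize call and, when the returned point slightly violates the norm constraint, radially project it back onto the feasible ball before using it. A sketch:

```julia
using LinearAlgebra

# Hypothetical post-processing step: radially project a candidate point onto
# the closed unit ball {x : sum(x.^2) <= 1}. This only repairs (even large)
# radial violations like the one above; it is not a substitute for the
# solver actually converging to a feasible point.
project_to_ball(x) = (r = norm(x); r > 1 ? x ./ r : x)

x = [-0.9999983923736098, -0.9999983923736098]  # infeasible minimizer above
xp = project_to_ball(x)
@show sum(abs2, xp)   # 1.0 up to rounding: now on the constraint boundary
@show sum(xp)         # ≈ -sqrt(2), the analytic optimum
```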