IPNewton and ForwardDiff with preallocated Arrays

Hi!

I am new here, so please let me know if this is the right place to post. I could not find this issue in other threads.

I need to solve a constrained optimization problem, and I’d like to use Optim.optimize with the IPNewton method. The cost function makes use of many arrays, so I would like to preallocate all of them. I also want to use automatic differentiation.

The preallocated arrays need to have the right type (which seems to be

ForwardDiff.Dual{ForwardDiff.Tag{var"#J#409"{Array{Float64,1}},Float64},Float64,5}

in my case), but the call to optimize then fails with a MethodError.

Here is a toy script that reproduces exactly the same error as my real code. It just looks for the square roots of an array of Floats.

using Optim, ForwardDiff

function test()
  data = [1.0, 2.0, 3.0, 4.0, 5.0]
  U = fill(1.0, length(data))
  function J(U)
    # makes use of the preallocated array aux
    for i in 1:length(U)
      aux[i] = U[i]^2
    end
    sum((aux .- data).^2)
  end
  cfg = ForwardDiff.GradientConfig(J, U)
  g! = (grad, x) -> ForwardDiff.gradient!(grad, J, x, cfg)
  h! = (hess, x) -> ForwardDiff.hessian!(hess, J, x)
  type = eltype(cfg)
  aux = ones(type, length(data)) #here it is
  df = TwiceDifferentiable(J, g!, h!, U)
  #lower = fill(zero(type), length(data))
  lower = fill(zero(eltype(U)), length(data))
  upper = fill(Inf, length(data))
  dfc = TwiceDifferentiableConstraints(lower, upper)
  opt = optimize(df, dfc, U, IPNewton())
  return opt
end

test()

And here is the error I get:

ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#J#419"{Array{Float64,1}},Float64},Float64,5})
Closest candidates are:
  Float64(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  Float64(::T) where T<:Number at boot.jl:716
  Float64(::Float32) at float.jl:255
  ...
Stacktrace:
 [1] convert(::Type{Float64}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#J#419"{Array{Float64,1}},Float64},Float64,5}) at ./number.jl:7
 [2] setproperty!(::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Symbol, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#J#419"{Array{Float64,1}},Float64},Float64,5}) at ./Base.jl:34
 [3] value_gradient!!(::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Array{Float64,1}) at /Users/sylvainmoi/.julia/packages/NLSolversBase/QPnui/src/interface.jl:82
 [4] value_gradient! at /Users/sylvainmoi/.julia/packages/NLSolversBase/QPnui/src/interface.jl:69 [inlined]
 [5] initial_state(::IPNewton{typeof(Optim.backtrack_constrained_grad),Symbol}, ::Optim.Options{Float64,Nothing}, ::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::TwiceDifferentiableConstraints{NLSolversBase.var"#131#134",NLSolversBase.var"#132#135",NLSolversBase.var"#133#136",Float64}, ::Array{Float64,1}) at /Users/sylvainmoi/.julia/packages/Optim/Yd5Zq/src/multivariate/solvers/constrained/ipnewton/ipnewton.jl:125
 [6] optimize(::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::TwiceDifferentiableConstraints{NLSolversBase.var"#131#134",NLSolversBase.var"#132#135",NLSolversBase.var"#133#136",Float64}, ::Array{Float64,1}, ::IPNewton{typeof(Optim.backtrack_constrained_grad),Symbol}, ::Optim.Options{Float64,Nothing}) at /Users/sylvainmoi/.julia/packages/Optim/Yd5Zq/src/multivariate/solvers/constrained/ipnewton/interior.jl:228 (repeats 2 times)
 [7] test() at /Users/sylvainmoi/Documents/These/Optim_Julia/cas_ecole.jl:24
 [8] top-level scope at none:1

It has to do with the function Optim.value_gradient!. If I change

aux = ones(type, length(data))

to

aux = ones(Real, length(data))

it works, but it defeats the purpose, because each

aux[i] = U[i]^2

in the for-loop then allocates on every call.
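
For reference, here is a minimal check (just an illustration on my side, not part of my real code) of why the abstract element type costs an allocation per assignment:

aux_real    = ones(Real, 5)     # abstract element type: each stored value is boxed
aux_float64 = ones(Float64, 5)  # concrete element type: values are stored inline

isconcretetype(eltype(aux_real))     # false, so aux_real[i] = x allocates a box
isconcretetype(eltype(aux_float64))  # true, so no per-element allocation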

Thanks a lot if you can have a look at that.

Cheers,

You have the right idea of using eltype when you want to pre-allocate things, but I think the problem is that aux needs to have a different element type for different calls. For example, calling J with the initial vector to get the objective value means you want aux to be a Vector{Float64}, whereas for the AD calls you want it to be a Vector{Dual{T,V,N}} where {T,V,N}. The reason it works with Real, an abstract type, is that both Float64 and Dual are subtypes of Real.
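
To make that concrete, here is a small standalone sketch (my own illustration, not your J) showing the two element types the same function has to handle:

using ForwardDiff

# The same function sees plain Float64 for a value call and ForwardDiff.Dual
# numbers for an AD call, so a buffer fixed to one of them cannot serve both.
f(x) = (println(eltype(x)); sum(abs2, x))

f(ones(5))                       # prints Float64
ForwardDiff.gradient(f, ones(5)) # prints ForwardDiff.Dual{...}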

I could be incorrect, but it may be easiest to refactor your problem a bit. Presumably your real application actually does need aux, but in this case you don’t actually need it, and a similar idea of this refactor and moving of J outside test might also work for your real problem.

using Optim, ForwardDiff

# Probably cleaner to make this a separate function that takes more arguments
# and create a closure in test.
_J(U, data) = sum(z->abs2(z[1]^2-z[2]), zip(U, data))

function test()
  data = [1.0, 2.0, 3.0, 4.0, 5.0]
  U    = ones(length(data))
  obj  = x -> _J(x, data)
  cfg = ForwardDiff.GradientConfig(obj, U)
  g! = (grad, x) -> ForwardDiff.gradient!(grad, obj, x, cfg)
  h! = (hess, x) -> ForwardDiff.hessian!(hess, obj, x)
  df = TwiceDifferentiable(obj, g!, h!, U)
  lower = fill(0.0, length(data))
  upper = fill(Inf, length(data))
  dfc = TwiceDifferentiableConstraints(lower, upper)
  opt = optimize(df, dfc, U, IPNewton())
  return opt
end

result = test()
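
If your real application does need a preallocated buffer, one possible pattern (a sketch on my side, untested against your actual problem) is to keep one buffer per element type, so that plain Float64 calls and Dual calls each reuse a correctly typed cache:

using ForwardDiff

function make_J(data)
  caches = Dict{DataType,Vector}()  # one reusable buffer per element type
  function J(U)
    # fetch (or create on first use) the buffer matching eltype(U);
    # the assertion helps inference, since the Dict lookup is not type-stable
    aux = get!(() -> similar(U), caches, eltype(U))::Vector{eltype(U)}
    for i in eachindex(U)
      aux[i] = U[i]^2
    end
    return sum((aux .- data).^2)
  end
  return J
end

J = make_J([1.0, 2.0, 3.0, 4.0, 5.0])
J(ones(5))                        # reuses a Vector{Float64} buffer
ForwardDiff.gradient(J, ones(5))  # reuses a Vector{Dual{...}} buffer

As written, this keeps one cache alive per element type it has seen, which is usually fine for the duration of an optimization run.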

Thank you for your answer!

Indeed, I see a type problem: inside the function test() above, I tried adding

  ...
  aux = ones(type, length(data))

  println(J(U)) #works
  println(J(aux)) #works
  println(ForwardDiff.gradient(J, U)) #works
  println(ForwardDiff.gradient(J, aux)) #reports MethodError
  ...

I’m a bit confused, because the ForwardDiff docs say that the x argument of ForwardDiff.gradient must be of type x::AbstractArray, and I assumed that Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#J#354"{Array{Float64,1}},Float64},Float64,5},1} (i.e. the type of aux) would match AbstractArray.

I guess you are right. I don’t absolutely need preallocated arrays in my problem, but I hoped they could speed up my code. I take note of your advice to write cleaner code.