ForwardDiff.jl / Optim.jl and recompilation for every vector size

I am trying to optimize a function with variable vectors of different sizes, and it looks like the method is compiled again for each size. The main function appears to be type-stable, judging by @code_warntype and @code_typed. My guess is that each Dual number tag leads to recompilation, but maybe I'm wrong. How can I solve this problem?

Please provide a minimal working example that can be copied and pasted.

@dpsanders, yep!

code:

using ForwardDiff
function f(x::Vector{T}) where T
    x2 = x .* x
    return log(sum(x2))
end
v1 = [1., 2.]
v2 = [1., 2., 3.]
@time ForwardDiff.hessian(f, v1)
@time ForwardDiff.hessian(f, v1)
@time ForwardDiff.hessian(f, v2)
@time ForwardDiff.hessian(f, v2)

out:

julia> @time ForwardDiff.hessian(f, v1)
  2.000931 seconds (4.09 M allocations: 232.179 MiB, 99.94% compilation time)
2×2 Matrix{Float64}:
  0.24  -0.32
 -0.32  -0.24

julia> @time ForwardDiff.hessian(f, v1)
  0.000030 seconds (10 allocations: 1.359 KiB)
2×2 Matrix{Float64}:
  0.24  -0.32
 -0.32  -0.24

julia> @time ForwardDiff.hessian(f, v2)
  2.122246 seconds (4.02 M allocations: 224.059 MiB, 7.06% gc time, 99.97% compilation time)
3×3 Matrix{Float64}:
  0.122449   -0.0408163  -0.0612245
 -0.0408163   0.0612245  -0.122449
 -0.0612245  -0.122449   -0.0408163

julia> @time ForwardDiff.hessian(f, v2)
  0.000030 seconds (10 allocations: 2.828 KiB)
3×3 Matrix{Float64}:
  0.122449   -0.0408163  -0.0612245
 -0.0408163   0.0612245  -0.122449
 -0.0612245  -0.122449   -0.0408163


The first call of ForwardDiff.hessian(f, v2) again takes 2.12 seconds, so everything is recompiled for the new vector length.

That’s because the chunk size is changing. Use a fixed chunk size if you don’t want recompilation.
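Roughly like this, using the f, v1 and v2 from the example above (just a sketch, and the chunk size itself is a tuning choice):

using ForwardDiff

# Fix the chunk size so the Dual types are identical for every input length.
chunk = ForwardDiff.Chunk{1}()

cfg2 = ForwardDiff.HessianConfig(f, v1, chunk)
cfg3 = ForwardDiff.HessianConfig(f, v2, chunk)

@time ForwardDiff.hessian(f, v1, cfg2)  # pays compilation once for the Chunk{1} Dual types
@time ForwardDiff.hessian(f, v2, cfg3)  # same Dual types, so f is not recompiled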

@ChrisRackauckas, thank you!

This example works fine with Chunk{1}().

How can I implement this with Optim.jl?

Previous code:

td = TwiceDifferentiable(optfunc, θ; autodiff = :forward)
optmethod = Optim.Newton(; alphaguess = LineSearches.InitialStatic(), linesearch = LineSearches.HagerZhang())
opt = Optim.optimize(td, θ, optmethod)

I tried this; it works, but it still triggers recompilation:

chunk = ForwardDiff.Chunk{1}()
gcfg  = ForwardDiff.GradientConfig(optfunc, θ, chunk)
hcfg  = ForwardDiff.HessianConfig(optfunc, θ, chunk)
gfunc!(g, x) = ForwardDiff.gradient!(g, optfunc, x, gcfg)
hfunc!(h, x) = ForwardDiff.hessian!(h, optfunc, x, hcfg)
td = TwiceDifferentiable(optfunc, gfunc!, hfunc!, θ)
optmethod = Optim.Newton(; alphaguess = LineSearches.InitialStatic(), linesearch = LineSearches.HagerZhang())
opt = Optim.optimize(td, θ, optmethod)
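One sanity check I find useful (my own sketch, with a made-up optfunc standing in for the real one): with a fixed chunk, the ForwardDiff config types are identical for different θ lengths, so optfunc itself should only be compiled once for those Dual types; any remaining compilation time is presumably coming from elsewhere, for example from freshly defined closures that Optim has to specialize on.

using ForwardDiff

# Hypothetical stand-ins for optfunc and two θ vectors of different length.
optfunc(x) = log(sum(abs2, x))
θ2 = [1.0, 2.0]
θ3 = [1.0, 2.0, 3.0]

chunk = ForwardDiff.Chunk{1}()
cfg2  = ForwardDiff.GradientConfig(optfunc, θ2, chunk)
cfg3  = ForwardDiff.GradientConfig(optfunc, θ3, chunk)

# With a fixed chunk the configs (and their Dual element types) have the same type,
# so the Dual-specialized methods of optfunc are shared across lengths.
# With the default Chunk(θ) this comparison is false, and each length recompiles.
typeof(cfg2) == typeof(cfg3)  # true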

Edit: fix.