I'm trying to optimize a function whose input vector can have different lengths, and it looks like the method gets compiled again for each input size. The main function is type-stable as far as I can tell from @code_warntype and @code_typed, so I suspect each tag for the Dual numbers leads to recompilation. Maybe I'm wrong… but how can I solve this problem?
Please provide a minimal working example that can be copied and pasted.
@dpsanders, yep!
code:
using ForwardDiff

function f(x::Vector{T}) where {T}
    x2 = x .* x
    return log(sum(x2))
end

v1 = [1.0, 2.0]
v2 = [1.0, 2.0, 3.0]

@time ForwardDiff.hessian(f, v1)  # first call: compiles
@time ForwardDiff.hessian(f, v1)  # same length: no compilation
@time ForwardDiff.hessian(f, v2)  # new length: compiles again
@time ForwardDiff.hessian(f, v2)
out:
julia> @time ForwardDiff.hessian(f, v1)
2.000931 seconds (4.09 M allocations: 232.179 MiB, 99.94% compilation time)
2×2 Matrix{Float64}:
0.24 -0.32
-0.32 -0.24
julia> @time ForwardDiff.hessian(f, v1)
0.000030 seconds (10 allocations: 1.359 KiB)
2×2 Matrix{Float64}:
0.24 -0.32
-0.32 -0.24
julia> @time ForwardDiff.hessian(f, v2)
2.122246 seconds (4.02 M allocations: 224.059 MiB, 7.06% gc time, 99.97% compilation time)
3×3 Matrix{Float64}:
0.122449 -0.0408163 -0.0612245
-0.0408163 0.0612245 -0.122449
-0.0612245 -0.122449 -0.0408163
julia> @time ForwardDiff.hessian(f, v2)
0.000030 seconds (10 allocations: 2.828 KiB)
3×3 Matrix{Float64}:
0.122449 -0.0408163 -0.0612245
-0.0408163 0.0612245 -0.122449
-0.0612245 -0.122449 -0.0408163
The first call with v2 takes 2.122246 seconds again — it recompiles for the new vector length.
That’s because the chunk size is changing. Use a fixed chunk size if you don’t want recompilation.
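Concretely: the chunk size is a type parameter of the Dual numbers, and (if I remember the default heuristic right) small inputs get a chunk size equal to their length, so every new length means a new Dual type and a fresh compile. A quick check with the v1 and v2 from above:

ForwardDiff.Chunk(v1)  # Chunk{2}() — default chunk for a length-2 input
ForwardDiff.Chunk(v2)  # Chunk{3}() — a different type, hence recompilation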
@ChrisRackauckas, thank you!
This example works fine with Chunk{1}().
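For reference, a minimal sketch of the fixed-chunk version of the example above — the config is still built per input length, but the Dual type is now the same for both sizes:

chunk = ForwardDiff.Chunk{1}()
cfg1 = ForwardDiff.HessianConfig(f, v1, chunk)
cfg2 = ForwardDiff.HessianConfig(f, v2, chunk)

@time ForwardDiff.hessian(f, v1, cfg1)  # compiles once for chunk size 1
@time ForwardDiff.hessian(f, v2, cfg2)  # same Dual type, so this should not recompile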
How can I do the same with Optim.jl?
Previous code:
using Optim, LineSearches

td = TwiceDifferentiable(optfunc, θ; autodiff = :forward)
optmethod = Optim.Newton(; alphaguess = LineSearches.InitialStatic(), linesearch = LineSearches.HagerZhang())
opt = Optim.optimize(td, θ, optmethod)
I did this, and it works, but it still recompiles:
# fixed chunk size so the Dual type stays the same across input lengths
chunk = ForwardDiff.Chunk{1}()
gcfg = ForwardDiff.GradientConfig(optfunc, θ, chunk)
hcfg = ForwardDiff.HessianConfig(optfunc, θ, chunk)

# in-place gradient and Hessian built on the fixed-chunk configs
gfunc!(g, x) = ForwardDiff.gradient!(g, optfunc, x, gcfg)
hfunc!(h, x) = ForwardDiff.hessian!(h, optfunc, x, hcfg)

td = TwiceDifferentiable(optfunc, gfunc!, hfunc!, θ)
optmethod = Optim.Newton(; alphaguess = LineSearches.InitialStatic(), linesearch = LineSearches.HagerZhang())
opt = Optim.optimize(td, θ, optmethod)
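If the sizes keep changing at runtime, one pattern is to rebuild the configs per size inside a helper, so the closure and Dual types stay identical across calls and the compiled code can be reused. A sketch — build_td and θ2 are hypothetical names, not part of Optim:

# hypothetical helper: rebuild configs and TwiceDifferentiable for a given θ
function build_td(optfunc, θ)
    chunk = ForwardDiff.Chunk{1}()
    gcfg = ForwardDiff.GradientConfig(optfunc, θ, chunk)
    hcfg = ForwardDiff.HessianConfig(optfunc, θ, chunk)
    gfunc!(g, x) = ForwardDiff.gradient!(g, optfunc, x, gcfg)
    hfunc!(h, x) = ForwardDiff.hessian!(h, optfunc, x, hcfg)
    return TwiceDifferentiable(optfunc, gfunc!, hfunc!, θ)
end

θ2 = rand(5)  # hypothetical starting point with a different length
@time Optim.optimize(build_td(optfunc, θ2), θ2, optmethod)  # timing shows whether it recompiles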
Edit: fix.