Hello all,
I am currently trying to build a model where some of the model parameters need to be optimized while others have to remain fixed. My current approach works rather well, but I am having trouble with automatic differentiation. It is best explained with a simplified, working example:
using ForwardDiff
using LinearAlgebra
using Optim
# Given parameters, can be arbitrarily long
parameter_values = ones(5)
# Sum of squares of the parameters, for example
function objective_function(params)
    return params' * params
end
println("Evaluation: $(objective_function(parameter_values))")
solution_all = optimize(objective_function, parameter_values, LBFGS(), autodiff=:forward)
println("Solution: $(solution_all.minimizer) with minimum value $(solution_all.minimum)")
This output is as desired, since the optimizer is allowed to modify all of the parameters:
Evaluation: 5.0
Solution: [0.0, 0.0, 0.0, 0.0, 0.0] with minimum value 0.0
Now I would like to optimize over only a subset of the parameters, namely those marked as mutable:
# User-provided Bool vector, same length as parameter_values
mutable_parameters = [false, false, false, true, true]
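For reference, logical indexing with this vector extracts just the mutable entries (here, the last two):

julia> parameter_values[mutable_parameters]
2-element Vector{Float64}:
 1.0
 1.0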
In order to fix some of the parameters, I create a new, modified version of objective_function:
# Separate fixed from mutable parameters
function separator(obj_func, selector_array, all_params, mutable_params)
    all_params[selector_array] = mutable_params
    return obj_func(all_params)
end
new_objective_function(params) = separator(objective_function, mutable_parameters, parameter_values, params)
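Note that separator writes into all_params in place, so each evaluation of new_objective_function also overwrites the mutable entries of the global parameter_values (acceptable for my use case, but it turns out to matter below):

# Evaluating at [2.0, 3.0] overwrites entries 4 and 5 of parameter_values,
# so the result is 1^2 + 1^2 + 1^2 + 2^2 + 3^2 = 16.0
new_objective_function([2.0, 3.0])
# parameter_values is now [1.0, 1.0, 1.0, 2.0, 3.0]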
This also works as intended (the first three parameters stay fixed at 1.0 and the last two are optimized towards 0.0; the reported minimum of 3.0 comes from the three fixed parameters contributing 1.0^2 = 1.0 each). However, in this example I cannot apply auto-differentiation:
println("Evaluation: $(new_objective_function(parameter_values[mutable_parameters]))")
# Note: No autodiff=:forward as above
solution_selected = optimize(new_objective_function, parameter_values[mutable_parameters], LBFGS())
println("Solution: $(solution_selected.minimizer) with minimum value $(solution_selected.minimum)")
Evaluation: 5.0
Solution: [1.0504708214398306e-11, -2.6163737842921364e-11] with minimum value 3.0
Question: What do I need to change in order to get auto-differentiation to work again?
Ideally it would be a small tweak to the current approach, but alternative approaches with the same functionality are also more than welcome!
Kindly,
EminentCoder
PS: If I just try to take the gradient of new_objective_function over the mutable parameters, this is what happens:
julia> ForwardDiff.gradient(new_objective_function, parameter_values[mutable_parameters])
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(new_objective_function), Float64}, Float64, 2})
The type `Float64` exists, but no method is defined for this combination of argument types when trying to construct it.
Closest candidates are:
(::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat
@ Base rounding.jl:265
(::Type{T})(::T) where T<:Number
@ Core boot.jl:900
Float64(::IrrationalConstants.Invsqrt2π)
@ IrrationalConstants ~/.julia/packages/IrrationalConstants/vp5v4/src/macro.jl:112
...
Stacktrace:
[1] convert(::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(new_objective_function), Float64}, Float64, 2})
@ Base ./number.jl:7
[2] setindex!(A::Vector{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{…}, Float64, 2}, i::Int64)
@ Base ./array.jl:976
[3] macro expansion
@ ./multidimensional.jl:981 [inlined]
[4] macro expansion
@ ./cartesian.jl:64 [inlined]
[5] _unsafe_setindex!(::IndexLinear, A::Vector{…}, x::Vector{…}, I::Base.LogicalIndex{…})
@ Base ./multidimensional.jl:979
[6] _setindex!
@ ./multidimensional.jl:967 [inlined]
[7] setindex!
@ ./abstractarray.jl:1413 [inlined]
[8] separator(obj_func::typeof(objective_function), selector_array::Vector{…}, all_params::Vector{…}, mutable_params::Vector{…})
@ Main ~/continuoustimesem/diff_problem.jl:20
[9] new_objective_function(parameters::Vector{ForwardDiff.Dual{ForwardDiff.Tag{…}, Float64, 2}})
@ Main ~/continuoustimesem/diff_problem.jl:24
[10] vector_mode_dual_eval!
@ ~/.julia/packages/ForwardDiff/UBbGT/src/apiutils.jl:24 [inlined]
[11] vector_mode_gradient(f::typeof(new_objective_function), x::Vector{…}, cfg::ForwardDiff.GradientConfig{…})
@ ForwardDiff ~/.julia/packages/ForwardDiff/UBbGT/src/gradient.jl:91
[12] gradient
@ ~/.julia/packages/ForwardDiff/UBbGT/src/gradient.jl:20 [inlined]
[13] gradient(f::typeof(new_objective_function), x::Vector{…}, cfg::ForwardDiff.GradientConfig{…})
@ ForwardDiff ~/.julia/packages/ForwardDiff/UBbGT/src/gradient.jl:17
[14] gradient(f::typeof(new_objective_function), x::Vector{Float64})
@ ForwardDiff ~/.julia/packages/ForwardDiff/UBbGT/src/gradient.jl:17
[15] top-level scope
@ REPL[4]:1
Some type information was truncated. Use `show(err)` to see complete types.
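From the stacktrace, my guess is that the in-place assignment all_params[selector_array] = mutable_params tries to store ForwardDiff.Dual numbers into the Vector{Float64}, which is where the convert error comes from. A non-mutating sketch along these lines (the names separator_nonmutating and new_objective_function2 are just my placeholders) might avoid the issue by building a buffer whose element type is wide enough for the Duals, but I am not sure it is idiomatic:

# Non-mutating variant: rebuild the full parameter vector instead of
# writing into the Float64 array, so Dual numbers can propagate
function separator_nonmutating(obj_func, selector_array, all_params, mutable_params)
    # Promote the element type so the buffer can hold ForwardDiff.Dual entries
    T = promote_type(eltype(all_params), eltype(mutable_params))
    combined = Vector{T}(undef, length(all_params))
    combined .= all_params
    combined[selector_array] = mutable_params
    return obj_func(combined)
end

new_objective_function2(params) = separator_nonmutating(objective_function, mutable_parameters, parameter_values, params)

# This should then allow e.g.
# ForwardDiff.gradient(new_objective_function2, parameter_values[mutable_parameters])

Would something along these lines be the right direction, or is there a cleaner way?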