I am trying to do some design optimization using a gradient-based method. For that I have written an analysis code that has a ton of parameters (some of which I want to optimize), which are stored in data structures (structs).
The problem is that I am running into type conversion errors when trying to use ForwardDiff or ReverseDiff to get gradients, because in those data structures the parameters' types are set to `Float64`.
As an example, consider the following (note that this is a phenomenally stupid example, but it highlights what I am trying to do):
```julia
using ForwardDiff

struct ThinWalledCircle
    t::Float64
    R::Float64
end

struct ThinWalledCirclePar{T}
    t::T
    R::T
end

function area(circle)
    return 2 * π * circle.R * circle.t
end

function compute_area_circle(t::Float64, R::Number)
    circle = ThinWalledCirclePar(t, R)
    # circle = ThinWalledCircle(t, R)
    return area(circle)
end

compute_area_circle(1e-3, 0.5)

Rfunc = R -> compute_area_circle(1e-3, R)
ForwardDiff.derivative(Rfunc, 0.4)
```
With either type, `ThinWalledCircle` or `ThinWalledCirclePar`, I am running into errors, because either the type of `t` is wrong or the type of `R` is wrong.
The error message for the code above is:

```
ERROR: LoadError: MethodError: no method matching ThinWalledCirclePar(::Float64, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#35#36",Float64},Float64,1})
Closest candidates are:
  ThinWalledCirclePar(::T, ::T) where T at /app/test/diff_ds_example.jl:12
Stacktrace:
 [1] compute_area_circle(::Float64, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#35#36",Float64},Float64,1}) at /app/test/diff_ds_example.jl:25
 [2] (::var"#35#36")(::ForwardDiff.Dual{ForwardDiff.Tag{var"#35#36",Float64},Float64,1}) at /app/test/diff_ds_example.jl:33
 [3] derivative(::var"#35#36", ::Float64) at /root/.julia/packages/ForwardDiff/CrVlm/src/derivative.jl:13
 [4] top-level scope at /app/test/diff_ds_example.jl:35
 [5] include(::String) at ./client.jl:439
 [6] top-level scope at REPL[6]:1
 [7] eval(::Module, ::Any) at ./boot.jl:331
 [8] eval_user_input(::Any, ::REPL.REPLBackend) at /usr/local/julia-1.4.2/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86
 [9] run_backend(::REPL.REPLBackend) at /root/.julia/packages/Revise/BqeJF/src/Revise.jl:1184
 [10] top-level scope at none:0
in expression starting at /app/test/diff_ds_example.jl:35
```
I can think of two solutions, both with drawbacks:
1. Change all types from `Float64` to `Number`. But because `Number` is an abstract type, this is going to cost performance, right?
2. Make each data structure parametric on at least two types (one for the non-optimized variables and one for the optimized variables). That shouldn't cost performance, but it results in a ton of (nested) parametric types, meaning the code gets quite messy (and compilation takes longer?). An additional problem is that I then can't optimize only a subset of the optimization variables, because I would again run into type errors.
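For concreteness, the two options could look roughly like this (the struct names here are made up for illustration, and an `Int` stands in for the `ForwardDiff.Dual` that would appear during differentiation):

```julia
# Option 1 sketch: abstract field types. Construction accepts any Number,
# but the fields are no longer concretely typed, which is the performance concern.
struct ThinWalledCircleAbstract
    t::Number
    R::Number
end

# Option 2 sketch: one type parameter per group of variables, so the
# non-optimized t can stay Float64 while R takes on a Dual type.
struct ThinWalledCircleTwoPar{T1,T2}
    t::T1
    R::T2
end

# The area function stays fully generic either way.
area2(c) = 2 * π * c.R * c.t

# Mixed input types now construct without a MethodError:
c = ThinWalledCircleTwoPar(1e-3, 1)
area2(c)
```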
Can anyone think of a better way of doing this? Or have an opinion on which approach is better (1 or 2)?