This is, strictly speaking, a math question rather than a Julia question, but I am running into it with AD and optimization.
I am solving a parametric system

$$f(x, \theta) = g(x, \theta)$$

for $x$ given $\theta$. I could implement this as
```julia
function residual(x, θ)
    F = f(x, θ)
    G = g(x, θ)
    F .- G
end
```
but near the optimum it may make more sense to use a relative criterion like `@. (F - G)/F` or `@. (F - G)/G`. However, these may fail if either `F ≈ 0` or `G ≈ 0`.
Some textbooks recommend something like `@. (F - G)/max(1, F, G)`, but then the derivatives are not continuous.
Is there a “standard” way of doing what I want in a continuously differentiable manner? I think I could rig up something with a softmax, but thought I would ask here first.
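For concreteness, here is a sketch of the criteria mentioned above (the function names are mine, and I read `max(1, F, G)` as the usual textbook scaling with absolute values):

```julia
# Plain residual: fine away from the optimum, scale-dependent.
residual_plain(F, G) = F .- G

# Relative residual: breaks down when F ≈ 0.
residual_relF(F, G) = @. (F - G) / F

# Textbook-style scaled residual: safe near zero, but the derivative
# has kinks wherever the arguments of max cross.
residual_scaled(F, G) = @. (F - G) / max(1, abs(F), abs(G))
```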
Using `max(1, abs(F), abs(G))` in the denominator will prevent derivative discontinuities when `F - G` oscillates around zero, which is likely to happen in such a problem. You still get discontinuities if either `F` or `G` oscillates around zero, but this can be solved by creating a smoothed version of `abs()`.
If I understand it correctly, the approximation for $|x|$ would be

$$\sqrt{x^2 + d^2}$$

for $d > 0$. This is similar to something I saw on StackExchange, and could work very well. It is definitely simpler and easier for me to reason about.