Continuous relative/absolute residual

This is, strictly speaking, not a Julia but a math question, but I am running into this with AD and optimization.

I am solving a parametric system

f(x, \theta) = g(x, \theta)

for x given \theta. I could implement this as

function residual(x, θ)
    F = f(x, θ)
    G = g(x, θ)
    F .- G

but it may make more sense near the optimum to use a criterion like @. (F - G)/F or @. (F - G)/G. But this may fail if either F ≈ 0 or G ≈ 0.

Some texbooks recommend something like @. (F - G)/max(1, F, G), but then the derivatives are not continuous.

Is there a “standard” way of doing what I want in a continuously differentiable way? I think I could combine a softmax to get this, but thought I would ask first here.

Not sure about a standard way, but:

@. (F - G)/( 1 + abs(F) + abs(G) )

will prevent derivative discontinuities when F-G oscillates around zero which is likely to happen in such a problem.
You still get discontinuities if either F or G are oscillating around zeros, but this can be solved by creating a smoothed version of abs()

The way it’s done in differential equations is to mix them:

@. (F - G)/max(abstol + max(F,G)*reltol) < 1

Then those tolerances can of course either be scalars or arrays that match the size of F and G.

With max(F,G) you probably mean max(abs(F),abs(G)) ?

This can be modified to make its derivative continuous around F-G = 0 into:

@. (F - G)/( abstol + 0.5 * reltol * ( abs(F) + abs(G) ) )


I found this nice approximation integrating a sigmoid function to |x|, which motivates

using PGFPlotsX, StatsFuns

approx_abs(x, k) = 2/k*(log1pexp(k*x) - log(2)) - x
g(x, y) = abs(x - y) / (1 + abs(x) + abs(y)) # we approximate this
f(x, y, k) = approx_abs(x - y, k) / (1 + approx_abs(x, k) + approx_abs(y, k))

# graphs
grid = range(-10, 10; length = 1000)
@pgf Axis({ legend_pos = "south east", legend_cell_align = "left"},
          PlotInc({ no_marks }, Table(grid, g.(grid, 0.2))),
          PlotInc({ no_marks }, Table(grid, f.(grid, 0.2, 1))),
          LegendEntry("f; k = 1"),
          PlotInc({ no_marks }, Table(grid, f.(grid, 0.2, 10))),
          LegendEntry("f; k = 10"))

Now I need to think about the numerical corner cases (after a :coffee:) and whether to package this somehow, or find a home in an existing package.

1 Like

I got a bunch of smoothing functions, which could use a home too. Note, you could maybe use the hyperpola function in above.

1 Like

If I understand it correctly, the approximation for |x| would be

\sqrt{x^2 + d^2}

for d > 0. This is similar to something I saw on stackexchange, and could work very well. It is definitely simpler and easier to reason about (for me).

I will experiment and get back here.

Is this a nonlinear system? How are you solving this?

It is a highly nonlinear system. Currently solving with Optim (minimizing squared residuals), will experiment with the other nonlinear solvers.

1 Like