Firstly, this "smooth function" is something I initially encountered in a paper. It seems to be used to replace the "ifelse" function so that the expression remains differentiable. I attempted to calculate the derivatives using both the "smooth" and "ifelse" approaches separately with ForwardDiff:
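Something along these lines (a minimal sketch; the 5.0 scale factor and the evaluation point are illustrative, not my exact code):

using ForwardDiff

# tanh-based smooth step vs. an ifelse-based hard step
step_smooth(v) = (tanh(5.0 * v) + 1.0) * 0.5
step_ifelse(v) = ifelse(v > 0, 1.0, 0.0)

ForwardDiff.derivative(step_smooth, 0.3)  # 2.5 * sech(5 * 0.3)^2 ≈ 0.45
ForwardDiff.derivative(step_ifelse, 0.3)  # 0.0 (each branch is constant)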
I found that functions based on "ifelse" can also be differentiated using ForwardDiff. Being a beginner in this area, I started to question whether the "smooth function" is necessary when dealing with situations involving "ifelse".
Moving on to the second question: the "smooth function" doesn't produce the expected results when the input approaches zero. For example:
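Something like this (illustrative numbers):

smooth(v) = (tanh(v) + 1.0) * 0.5
smooth(0.0)   # 0.5
smooth(0.01)  # ≈ 0.505, whereas ifelse(0.01 > 0, 1.0, 0.0) == 1.0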
This greatly affects my subsequent calculations and the results of differentiation. After multiple attempts, I found that using the constant 5.0 as the first factor inside the tanh (i.e., tanh(5.0 * x)) can effectively avoid this issue. There appears to be a significant correlation between the input scale and the "smooth function": when abs(5.0*x) >> 1, the result matches expectations.
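For instance (a sketch of the workaround just described):

smooth5(x) = (tanh(5.0 * x) + 1.0) * 0.5
smooth5(1.0)   # ≈ 0.99995
smooth5(-1.0)  # ≈ 0.00005; abs(5.0 * x) >> 1, so the hard step is reproduced well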
That's why I'm asking about your use case. In most cases, you don't care about the behavior near the non-differentiable point (in this case a discontinuity). ForwardDiff makes an arbitrary choice; other autodiff backends may make other choices, but without knowing what you are doing it's hard to decide whether these choices matter at all.
What is your "expected result" near zero? It seems to approach 0.5, which is a reasonable behavior.
I have written a demo (though it does not fully reproduce the issue I encountered before); see below:
using OrdinaryDiffEq, ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as D

@parameters σ τ β   # σ: rate after the switch, τ: switch time, β: sharpness
@variables x(t)

step_func(v) = (tanh(β * v) + 1.0) * 0.5

eqs = [
    D(x) ~ step_func(t - τ) * σ
]

@mtkbuild sys = ODESystem(eqs, t)

u0 = [x => 1.0]
p = [
    σ => 1.0,
    τ => 5.0,
    β => 1.0]
tspan = (4.5, 5.5)
prob = ODEProblem(sys, u0, tspan, p, jac=true)

for i in [0.1, 1.0, 5.0, 10.0, 50.0]
    new_prob = remake(prob, p=[β => i])
    new_sol = solve(new_prob, Tsit5(), saveat=0.1)
    println(vcat(new_sol.u...))
end
I keep adjusting the parameters of the smooth function, and the calculation results still show deviations.
Using the smooth function in my project may lead to even worse situations, such as the ODE solver failing to find a solution, or the results containing unexpected negative numbers.
What is your "expected result" near zero? It seems to approach 0.5, which is a reasonable behavior.
I hope to nearly perfectly simulate the function v > 0 ? 1 : 0, but when the input approaches 0, the result of the smooth function has a slight discrepancy, for example:
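With the demo's step_func and β = 50.0 (illustrative values):

step_func(0.01)  # (tanh(50.0 * 0.01) + 1.0) * 0.5 ≈ 0.731, not the 1.0 that v > 0 ? 1 : 0 gives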
The standard solution to this is to "compactify" the transition region of a sigmoid. The classic function you see in differential geometry is
\verb|smooth_func|(x) = \begin{cases}
0 & x \leq -1, \\
\frac{1+\tanh[x/(1-x^2)]}{2} & -1 < x < 1, \\
1 & x \geq 1.
\end{cases}
This turns out to be a smooth (infinitely differentiable, C^\infty) function. Here's an implementation I've used that I think does a pretty good job of dealing with the various roundoff/overflow issues that can arise with this function. I've written it for vector input, but you could convert it to accept a single point if you want:
function smooth_func(x::AbstractArray, x₀, x₁, y₀=0, y₁=1)
    @assert x₀ ≤ x₁
    y = similar(x)
    x₀, x₁, y₀, y₁, m² = convert.(eltype(y), (x₀, x₁, y₀, y₁, 3))
    m = √m²
    Δy = (y₁ - y₀) / 2
    Δx = (x₁ - x₀) / 2
    ȳ = (y₀ + y₁) / 2
    x̄ = (x₀ + x₁) / 2
    for i ∈ eachindex(x, y)
        x̂ = (x[i] - x̄) / Δx
        # Note: These || cases are nearly redundant,
        # but protect against very rare roundoff issues
        if x̂ ≤ -1 || x[i] ≤ x₀
            y[i] = y₀
        elseif x̂ ≥ 1 || x[i] ≥ x₁
            y[i] = y₁
        else
            y[i] = ȳ + Δy * tanh(m * x̂ / (1 - x̂^2))
        end
    end
    return y
end
You have to tell it the x₀ value where you want it to start transitioning and the x₁ value where you want it to stop transitioning. Optionally, you can also adjust the y values before and after the transition. Then, you can use it in the standard way:
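For instance (a sketch; the transition interval and y values here are arbitrary choices of mine):

x = range(-0.2, 0.2; length=401)
y  = smooth_func(x, -0.1, 0.1)        # transition from 0 to 1 over [-0.1, 0.1]
y2 = smooth_func(x, -0.1, 0.1, 2, 3)  # same transition, but from y₀ = 2 to y₁ = 3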
For example, it can cause optimization algorithms to converge very slowly (e.g. because the second-derivative matrix may be badly conditioned in a multivariate problem). Or an ODE solver may converge very slowly (requiring tiny time steps).
Really, people are just stabbing in the dark here without knowing the application. There is no generically "correct" or "good" way to make a non-differentiable function differentiable, because the question itself is underspecified.
In an ODE context, it's much, much better to just put in the correct discontinuity analytically, e.g. via callbacks. There is no conceptual problem with having discontinuities or (distributional) derivatives thereof on the rhs of an ODE, and implementing this isn't too hard either.
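A minimal sketch of this idea for the step ODE above (my construction, not code from this thread; it uses tstops, the simplest way to make the solver respect the analytic discontinuity, rather than a full callback):

using OrdinaryDiffEq

# Exact step on the right-hand side; no smoothing required.
f(x, p, t) = t > p.τ ? p.σ : 0.0      # τ: switch time, σ: rate after the switch
p = (τ = 5.0, σ = 1.0)
prob = ODEProblem(f, 1.0, (4.5, 5.5), p)
sol = solve(prob, Tsit5(); tstops = [p.τ])  # force a step exactly at t = τ

If the switch time depended on the state instead, a ContinuousCallback would locate the crossing automatically.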