Firstly, this "smooth function" is something I initially encountered in a paper. It seems to be used to replace the "ifelse" function so that the expression remains differentiable. I attempted to calculate the derivatives using both the "smooth" and "ifelse" approaches separately with ForwardDiff:
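Something along these lines (a minimal sketch; the 5.0 scale factor and the evaluation point are illustrative, not my exact code):

using ForwardDiff

# tanh-based smooth step vs. an ifelse-based hard step
step_smooth(v) = (tanh(5.0 * v) + 1.0) * 0.5
step_ifelse(v) = ifelse(v > 0, 1.0, 0.0)

ForwardDiff.derivative(step_smooth, 0.3)  # 2.5 * sech(5 * 0.3)^2 ≈ 0.45
ForwardDiff.derivative(step_ifelse, 0.3)  # 0.0 (each branch is constant)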
I found that functions based on "ifelse" can also be differentiated using ForwardDiff. Being a beginner in this area, I started to question whether the "smooth function" is necessary when dealing with situations involving "ifelse".
Moving on to the second question: the "smooth function" doesn't produce the expected results when the input approaches zero. For example:
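Something like this (illustrative numbers):

smooth(v) = (tanh(v) + 1.0) * 0.5
smooth(0.0)   # 0.5
smooth(0.01)  # ≈ 0.505, whereas ifelse(0.01 > 0, 1.0, 0.0) == 1.0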
This greatly affects my subsequent calculations and the results of differentiation. After multiple attempts, I found that using the constant 5.0 as the first factor inside the tanh (i.e., tanh(5.0 * x)) can effectively avoid this issue. There appears to be a significant correlation between the input scale and the "smooth function": when abs(5.0*x) >> 1, the result matches expectations.
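For instance (a sketch of the workaround just described):

smooth5(x) = (tanh(5.0 * x) + 1.0) * 0.5
smooth5(1.0)   # ≈ 0.99995
smooth5(-1.0)  # ≈ 0.00005; abs(5.0 * x) >> 1, so the hard step is reproduced well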
That's why I'm asking about your use case. In most cases, you don't care about the behavior near the non-differentiable point (in this case a discontinuity). ForwardDiff makes an arbitrary choice; other autodiff backends may make other choices, but without knowing what you are doing it's hard to decide whether these choices matter at all.
What is your "expected result" near zero? It seems to approach 0.5, which is a reasonable behavior.
I have written a demo (though it does not fully reproduce the issue I encountered before); see below:
using OrdinaryDiffEq, ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as D

@parameters σ τ β   # σ: rate after the switch, τ: switch time, β: sharpness
@variables x(t)

step_func(v) = (tanh(β * v) + 1.0) * 0.5

eqs = [
    D(x) ~ step_func(t - τ) * σ
]

@mtkbuild sys = ODESystem(eqs, t)

u0 = [x => 1.0]
p = [
    σ => 1.0,
    τ => 5.0,
    β => 1.0]
tspan = (4.5, 5.5)
prob = ODEProblem(sys, u0, tspan, p, jac=true)

for i in [0.1, 1.0, 5.0, 10.0, 50.0]
    new_prob = remake(prob, p=[β => i])
    new_sol = solve(new_prob, Tsit5(), saveat=0.1)
    println(vcat(new_sol.u...))
end
I keep adjusting the parameters of the smooth function, and the calculation results still show deviations.
Using the smooth function in my project may lead to even worse situations, such as the ODE solver failing to find a solution, or the results containing unexpected negative numbers.
What is your "expected result" near zero? It seems to approach 0.5, which is a reasonable behavior.
I hope to nearly perfectly simulate the function v > 0 ? 1 : 0, but when the input approaches 0, the result of the smooth function has a slight discrepancy, for example:
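With the demo's step_func and β = 50.0 (illustrative values):

step_func(0.01)  # (tanh(50.0 * 0.01) + 1.0) * 0.5 ≈ 0.731, not the 1.0 that v > 0 ? 1 : 0 gives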
The standard solution to this is to "compactify" the transition region of a sigmoid. The classic function you see in differential geometry is
\verb|smooth_func|(x) = \begin{cases}
0 & x \leq -1, \\
\frac{1+\tanh[x/(1-x^2)]}{2} & -1 < x < 1, \\
1 & x \geq 1.
\end{cases}
This turns out to be a smooth (infinitely differentiable, C^\infty) function. Here's an implementation I've used that I think does a pretty good job of dealing with the various roundoff/overflow issues that can arise with this function. I've written it for vector input, but you could convert it to accept a single point if you want:
function smooth_func(x::AbstractArray, x₀, x₁, y₀=0, y₁=1)
    @assert x₀ ≤ x₁
    y = similar(x)
    x₀, x₁, y₀, y₁, m² = convert.(eltype(y), (x₀, x₁, y₀, y₁, 3))
    m = √m²
    Δy = (y₁ - y₀) / 2
    Δx = (x₁ - x₀) / 2
    ȳ = (y₀ + y₁) / 2
    x̄ = (x₀ + x₁) / 2
    for i ∈ eachindex(x, y)
        x̂ = (x[i] - x̄) / Δx
        # Note: These || cases are nearly redundant,
        # but protect against very rare roundoff issues
        if x̂ ≤ -1 || x[i] ≤ x₀
            y[i] = y₀
        elseif x̂ ≥ 1 || x[i] ≥ x₁
            y[i] = y₁
        else
            y[i] = ȳ + Δy * tanh(m * x̂ / (1 - x̂^2))
        end
    end
    return y
end
You have to tell it the x₀ value where you want it to start transitioning and the x₁ value where you want it to stop transitioning. Optionally, you can also adjust the y values before and after the transition. Then, you can use it in the standard way:
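For instance (a sketch; the transition interval and y values here are arbitrary choices of mine):

x = range(-0.2, 0.2; length=401)
y  = smooth_func(x, -0.1, 0.1)        # transition from 0 to 1 over [-0.1, 0.1]
y2 = smooth_func(x, -0.1, 0.1, 2, 3)  # same transition, but from y₀ = 2 to y₁ = 3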
For example, it can cause optimization algorithms to converge very slowly (e.g. because the second-derivative matrix may be badly conditioned in a multivariate problem). Or an ODE solver may converge very slowly (requiring tiny time steps).
Really, people are just stabbing in the dark here without knowing the application. There is no generically "correct" or "good" way to make a non-differentiable function differentiable, because the question itself is underspecified.
In an ODE context, it's much, much better to just put in the correct discontinuity analytically, e.g. via callbacks. There is no conceptual problem with having discontinuities or (distributional) derivatives thereof on the rhs of an ODE, and implementing this isn't too hard either.
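A minimal sketch of this idea for the step ODE above (my construction, not code from this thread; it uses tstops, the simplest way to make the solver respect the analytic discontinuity, rather than a full callback):

using OrdinaryDiffEq

# Exact step on the right-hand side; no smoothing required.
f(x, p, t) = t > p.τ ? p.σ : 0.0      # τ: switch time, σ: rate after the switch
p = (τ = 5.0, σ = 1.0)
prob = ODEProblem(f, 1.0, (4.5, 5.5), p)
sol = solve(prob, Tsit5(); tstops = [p.τ])  # force a step exactly at t = τ

If the switch time depended on the state instead, a ContinuousCallback would locate the crossing automatically.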