I want to numerically invert a function and evaluate the inverse on a ‘large’ array.
So far I have studied two options: a Newton-Raphson algorithm (I know the first derivative analytically) and interpolation.
The function I need to invert is monotonically increasing; it is called luminosity_distance_at_z in the example below.
using BenchmarkTools
using Interpolations
using QuadGK
function NewtonRaphson(f, df, y0; x_guess=1.0, tol=1e-6, maxIter=1000)
    # find x0 such that f(x0) = y0
    ff(x) = f(x) - y0
    x = x_guess
    ffx = ff(x)
    iter = 0
    while abs(ffx) > tol && iter < maxIter
        x -= ffx/df(x)  # Newton step
        ffx = ff(x)     # recompute the residual
        iter += 1
    end
    return x
end
function inverse_interpolation(f, y0s)
    xs = vcat(range(start = 0, stop = 9.99e-9, length = 125),
              10 .^ (range(start = -8, stop = log10(7.99), length = 1000)),
              10 .^ (range(start = log10(8), stop = 8, length = 125)))
    ys = f.(xs)
    inv = interpolate((ys,), xs, Gridded(Linear()))  # f is increasing, so ys is sorted
    return inv.(y0s)
end
function E_at_z(z::Real)
    sqrt(0.3*(1 + z)^3 + 0.69991 + 9e-5*(1 + z)^4)
end

function integral_of_E_inv(z::Real)
    QuadGK.quadgk(z -> 1/E_at_z(z), 0., z)[1]
end

const c_light = 299792.458 # km/s

function comoving_radial_distance_at_z(z::Real)
    dH = c_light/67
    return dH * integral_of_E_inv(z)
end

# analytic derivative of luminosity_distance_at_z, used by Newton-Raphson
function differential_luminosity_distance_at_z(z::Real)
    dH = c_light/67
    comoving_radial_distance_at_z(z) + (dH/E_at_z(z))*(1 + z)
end

function luminosity_distance_at_z(z::Real)
    comoving_radial_distance_at_z(z)*(1 + z)
end
When I evaluate the inverse at a single point, the NR algorithm wins.
When I evaluate the inverse on a ‘large’ set of points, the interpolation approach seems to perform better.
I think the difference is due to the fact that the interpolation method evaluates f(x) 1250 times in total (in this example), while the NR algorithm evaluates f(x) and its derivative df(x) once per iteration for each point of the array (and a few iterations per point are usually needed).
Since NR is more precise, are there any ways to make it as fast as the interpolation approach when evaluated on a large set of points?
Since I’m new, I suspect I may be vectorizing the NR algorithm via broadcasting incorrectly, and that there could be more efficient ways to vectorize it.
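One variant I can imagine (a sketch, not benchmarked): since the function is increasing, sorting the targets and warm-starting each solve at the previous root should cut the number of iterations per point:

function newton_raphson_warmstart(f, df, y0s; tol = 1e-6)
    xs = Vector{Float64}(undef, length(y0s))
    x = 1.0
    for i in sortperm(y0s)  # visit the targets in increasing order
        # reuse the previous root as the initial guess for the next target
        x = NewtonRaphson(f, df, y0s[i]; x_guess = x, tol = tol)
        xs[i] = x
    end
    return xs
end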
I would go in the opposite direction and make the interpolation more accurate.
In particular, if you have a smooth function and evaluate it at Chebyshev points (not equally spaced), then polynomial interpolation is exponentially accurate; for a simple monotonic function (not too wiggly) you will probably achieve machine precision with only a handful of function evaluations. Once you have a polynomial interpolant accurate to machine precision, you can invert it easily (e.g. you could then find the Chebyshev interpolant of the inverse function).
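To illustrate the exponential-accuracy claim with the function from the question (FastChebInterp.jl is one package that implements this; the order 20 and the range [0, 4.5] below are arbitrary choices of mine):

using FastChebInterp
lo, hi = 0.0, 4.5
# degree-20 Chebyshev interpolant: only a handful of evaluations of f
p = chebinterp(luminosity_distance_at_z.(chebpoints(20, lo, hi)), lo, hi)
zs = range(lo, hi, length = 1001)
maximum(abs, p.(zs) .- luminosity_distance_at_z.(zs))  # max error; should approach roundoff for smooth f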
Orthogonal to @stevengj’s answer, you may want to use Cosmology.jl instead of implementing luminosity_distance from scratch.
And Roots.jl for function inversion instead of coding your own Newton method.
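For example (a sketch: the parameters are my guess at matching the constants in the question, and Cosmology.jl adds its own radiation term by default, so check the package docs):

using Cosmology, Roots, Unitful
c = cosmology(h = 0.67, OmegaM = 0.3)
dL(z) = ustrip(luminosity_dist(c, z))                # strip the Mpc unit for plain arithmetic
z_at_dL(d) = find_zero(z -> dL(z) - d, (0.0, 20.0))  # bracketed root find; dL is increasing

z_at_dL(1.0e3)  # redshift at a luminosity distance of 1000 Mpc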
Sorry but I’m not sure I have understood correctly.
To use Chebyshev interpolation as an inversion algorithm, I should first approximate the function I want to invert with a polynomial. Then, what do I have to do to invert such a polynomial ‘easily’? Evaluate it at the Chebyshev points of the function’s codomain, and then what?
Calculate the inverse at the Chebyshev points (appropriately transformed) of the codomain of the original problem. In other words, just treat the inverse like any other function you would approximate: you can compute it at these points with a Newton method or any other root-finding algorithm.
using FastChebInterp
f(x) = GW_distance_at_z(c, x) # the function to invert (c and dist are defined elsewhere in my code)
g(x) = f(x) - dist            # I want to solve g(x) = 0
yi = f(0)
yf = f(4.5)
y = chebpoints(10, yi, yf)            # compute Chebyshev points on the original codomain
inv_inter = chebinterp(g.(y), yi, yf) # Chebyshev interpolation of g(x)
# then I have to find the roots of g(x) = 0?
If so, what is the advantage over a traditional NR algorithm and the naive interpolation I implemented in the first post?
The assumption here is that your original function f(x) is expensive, so a polynomial interpolant is vastly cheaper to evaluate. So you:
1. Evaluate f(x) at a set of Chebyshev points to obtain an interpolating polynomial p(x). This is the only time you evaluate your expensive function f, and you only need a relatively small number of evaluations because of the exponential convergence.
2. In the codomain y, for a set of Chebyshev points y_k, you solve p(x) = y_k to obtain x = p^{-1}(y_k) by whatever method, e.g. Newton. Then you interpolate these p^{-1}(y_k) values with a Chebyshev polynomial q(y). This is fast because it only involves polynomial evaluations.
3. Your q(y) is now an approximation for f^{-1}(y). Evaluate q(y) as desired; this is fast because it only involves polynomial evaluations.
Compared to piecewise-polynomial interpolation à la Interpolations.jl, this should require many fewer function evaluations in step (1) to obtain a high-accuracy interpolant; you can easily approach the limits of roundoff error. (This is only for smooth f that you can evaluate at arbitrary points, as is the case here. Piecewise interpolation is useful for non-smooth functions, or for data provided at a fixed set of points that you can’t change to Chebyshev points.)
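Putting the three steps together (a sketch reusing luminosity_distance_at_z from the first post; the order 20, the z-range, and the use of Roots.jl for step 2 are my choices, not the only options):

using FastChebInterp, Roots
lo, hi = 0.0, 4.5
p = chebinterp(luminosity_distance_at_z.(chebpoints(20, lo, hi)), lo, hi)  # step 1: interpolate f
ylo, yhi = p(lo), p(hi)                                # codomain limits (f is increasing)
yk = chebpoints(20, ylo, yhi)                          # Chebyshev points in the codomain
xk = [find_zero(x -> p(x) - y, (lo, hi)) for y in yk]  # step 2: solve p(x) = y_k
q = chebinterp(xk, ylo, yhi)                           # step 3: interpolate the inverse

q(1.0e3)  # ≈ z such that luminosity_distance_at_z(z) ≈ 1000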