Interpolations inaccuracy

OlafTeichert · October 10, 2020, 10:20am

While moving code from python, I noticed that linear interpolation in Julia is less accurate than the python numpy interp function, see MWE below.

The error may seem tiny, but the value is integrated and adds up over time. Other than that, I’m also curious, why julia is less accurate than the python numpy interp function here.

Any ideas what might cause this or how to fix it?

using Interpolations


#Interpolation function
xp = [0.0,3600.0]
yp = [25.7,26]

itp = LinearInterpolation(xp,yp)
result = itp(3240)

#Manual check (y = ax+b)
a = (26-25.7)/3600
b = 25.7
result_check = 3240a+b

int_error = abs(result-result_check)

println("The interpolation error, $int_error, is larger than eps, $(eps())")

in python:

np.interp(3240,[0, 3600],[25.7, 26])

tamasgal · October 10, 2020, 10:41am

First, I think you should rather write: the linear interpolation in numpy yields different errors compared to the Interpolations.jl package. Interpolations are neither part of the Julia, nor Python.

Second: given your example, I get:

The interpolation error, 3.552713678800501e-15, is larger than eps, 2.220446049250313e-16

This error is extremely tiny and I am wondering what Python (numpy) yields (have not tried it yet, I was too lazy).

Lastly:

The error may seem tiny, but the value is integrated and adds up over time.

The error is extremely tiny and I am wondering what you do, so that these errors really “add up” over time, meaning that you get an offset to be worried about. In this case, it is much more likely that you have some fluctuations around the exact value due to floating point arithmethics and even if you integrate a curve, these errors should fluctuate around the truth.

Do you have an actual use-case where you can show that this is really a problem?

Btw. interpolating between two points is also a bit far from an actual use-case

tim.holy · October 10, 2020, 11:49am

Floating point numbers don’t have uniform absolute precision; a sensible comparison would be against

julia> eps(result_check)
3.552713678800501e-15

which shows that Interpolations.jl differs by 1ulp. (You can also use nextfloat(result_check) to verify this.)

Another way you can do the same calculation:

julia> f = 3240/3600
0.9

julia> (1-f)*yp[1] + f*yp[2]
25.970000000000002

which is the Interpolations.jl result. Which one is correct? Well, fma likes the Interpolations result:

julia> fma(1-f, yp[1], fma(f, yp[2], 0.0))
25.970000000000002

but taking the bounds at face value prefers the numpy:

julia> f = BigFloat(3240)/3600
0.8999999999999999999999999999999999999999999999999999999999999999999999999999965

julia> Float64((1-f)*yp[1] + f*yp[2])
25.97

ffevotte · October 10, 2020, 11:52am

Please remember that round-off errors are bounded in relative accuracy. I.e. you must either compute a relative error and compare that to eps(), or compute an absolute error and compare that to eps(expected_result)

Everything is very fine here, because:

julia> itp(3240) == nextfloat(result_check)
true

which means that what you expected and what you actually got are only one ulp apart.

But more importantly, if result_check is computed with double precision, why would you expect it to be sufficiently accurate to measure the “error” of another result computed in double-precision arithmetic?

If you actually want to assess the accuracy of a double-precision result, you have to use something else (ideally guaranteed to be more accurate) in order to get a reference. For example, BigFloat:

julia> biga = (BigFloat(26)-BigFloat(25.7))/BigFloat(3600)
8.333333333333353070631548891671829753451877170138888888888888888888888888888838e-05

julia> big_check = 3240biga + b
25.96999999999999992894572642398998141288757324218749999999999999999999999999989

julia> abs(result-big_check)/big_check
9.576047651753371803439068281745712155163492344466877751845994513632762834561385e-17

julia> abs(result-result_check)/big_check
1.368006807393339e-16

So it appears that the result LinearInterpolation gives you is actually more accurate than what you considered to be a reference result!

EDIT: … and probably (didn’t check) more accurate too than numpy

GunnarFarneback · October 10, 2020, 3:32pm

No, it’s not.

Both Interpolations and numpy are within one ulp of the correct result. It’s not very useful to compare one anecdotal data point though, so let’s check more interpolation points.

using Interpolations, PyCall, Statistics

xp = [0.0,3600.0]
yp = [25.7,26]

itp = LinearInterpolation(xp,yp)
results_itp = itp.(0:3600)

np = pyimport("numpy")
results_np = np.interp(0:3600, [0, 3600], [25.7, 26])

results_big = (0:3600) .* ((big(26) - big(25.7)) / big(3600)) .+ big(25.7)

errors_itp = abs.((results_itp .- results_big) ./ results_big)
errors_np = abs.((results_np .- results_big) ./ results_big)
errors_ideal = abs.((Float64.(results_big) .- results_big) ./ results_big)

print_result(s, x) = println(s,
                             round(Float64(maximum(x) / eps(Float64)), digits=5), " ",
                             round(Float64(mean(x) / eps(Float64)), digits=5))

println("                max     mean")
print_result("Interpolations: ", errors_itp)
print_result("Numpy:          ", errors_np)
print_result("Ideal:          ", errors_ideal)

with the results

                max     mean
Interpolations: 0.91171 0.23339
Numpy:          0.31465 0.15472
Ideal:          0.31056 0.1547

We can see that numpy is clearly doing better in the sub-ulp area than Interpolations and is in fact very close to the best possible Float64 results.

I can see 3 possible explanations:

Numpy was lucky with the sampled interpolation points (very unlikely).
Numpy’s algorithm has been carefully designed to give the best possible accuracy.
Numpy is actually computing in higher precision internally.

To get any further would require benchmarking the speed of the interpolations and/or inspecting the relevant source code, but I leave that to more curious people.

tim.holy · October 11, 2020, 8:18am

You do get the numpy answer if you do the division last. With

xp = [0.0,3600.0]
yp = [25.7,26]
x = 3240
f = x/(xp[2]-xp[1])

instead of

julia> (1-f)*yp[1] + f*yp[2]
25.970000000000002

do

julia> ((xp[2] - x)*yp[1] + (x - xp[1])*yp[2])/(xp[2] - xp[1])
25.97

If all of these were integers, though, you’d be vulnerable to overflow:

xp = [0, 3600]
yp = [typemax(Int), typemax(Int)]

julia> ((xp[2] - x)*yp[1] + (x - xp[1])*yp[2])/(xp[2] - xp[1])
-1.0

but the f version is not:

julia> f = x/(xp[2]-xp[1])
0.9

julia> (1-f)*yp[1] + f*yp[2]
9.223372036854776e18

numpy clearly converts things to Float64 first, so it prevents the overflow, but this also destroys the ability to do generic computation—for example, you can’t autodiff through a computation like that. So in Julia we would definitely not want to coerce to Float64 first.

On balance I suspect that despite the slightly larger inaccuracy the Interpolations.jl approach is actually the more useful of the two.

OlafTeichert · October 12, 2020, 10:17am

Thank you for the fast and detailed answers!

I must admit that I was too quick to blame the difference in interpolation errors for my problem. By now, I found the bug that caused my python code to produce a different result from my Julia code.

Thank you anyway for the interesting insights!

Topic		Replies	Views
Using Interpolations New to Julia interpolations	47	19307	October 5, 2020
LinearInterpolation between two points General Usage interpolations	5	1251	September 11, 2021
How to speed up the numerical integration with interpolation Performance question , interpolations , integral	7	1487	March 17, 2023
Fast Interpolations Performance	15	2310	August 20, 2018
Help with poor LinearInterpolation performance General Usage	10	1185	March 26, 2019

Interpolations inaccuracy

Related topics