Weird behavior of range with .-

I came across the following behavior of range that I did not expect:

Creating the range specified below and then subtracting its last element results in a range whose last element differs from the expected 0 by a fairly large relative amount.

julia> a = range(0, 100, length=50)

julia> b = 100.

julia> a .- b
-100.0:2.0408163265306123:-5.048709793414476e-29

julia> a[end] - b
0.0

This behavior also occurs for a .- a[end]. However,

julia> c = collect(range(0, 100, length=50))

julia> (c .- b)[end]
0.0

gives 0.0 as I would expect.

Does anybody know where this difference comes from? I’m using version 1.5.0 on Linux.

Looks like it boils down to this line:

and running it manually:

julia> a = range(0, 100, length=50)

julia> StepRangeLen{Float64}(a.ref - 100., a.step, length(a), a.offset)

My guess is that it’s because a.step is a TwicePrecision, and some floating-point rounding happened along the way.
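That guess can be checked by redoing the index computation by hand (field names as in Base; this is only a sketch of what getindex does for a StepRangeLen):

```julia
a = range(0, 100, length=50)

# a[end] is essentially a.ref + (length(a) - a.offset) * a.step,
# evaluated in TwicePrecision and only rounded to Float64 at the very end:
Float64(a.ref + (length(a) - a.offset) * a.step)   # 100.0, matching a[end]
```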


You might be able to avoid this kind of issue if you used a step that is exactly representable in binary; then this problem would likely go away. I admit that this would make your array have length 51 instead of 50.
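For instance (a sketch of one way to read that suggestion), an integer step of 2 is exactly representable, so the endpoint is reconstructed without any rounding error:

```julia
# With an exactly representable step there is no TwicePrecision
# correction term, and the endpoint comes out exact.
a = range(0, 100, step=2)   # 0:2:100, a StepRange with 51 elements
a[end] - 100                # exactly 0
```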

Well, the problem is: when what you expect is 0., any relative difference is “fairly wide”.

What happens here is a cancellation, i.e. the subtraction of two numbers that are very close to one another, resulting in a large increase of the relative error (while the absolute error stays essentially unchanged).

StepRangeLen objects (what range(0, 100, length=50) actually returns) store a reference point (0, in this case), a length (50, in this case), and a step size (100/49, in this case), and the endpoint is recomputed from these. With infinite-precision arithmetic it would be:

julia> 49 * (100//49)
100//1

i.e. exactly what we expect, no error at all. If the step size were computed in plain double-precision arithmetic, we’d have something like

julia> 49*BigFloat(100/49)

julia> 49*BigFloat(100/49)-100.

i.e. an absolute error in the order of 1e-15. But Julia goes an extra step along the way to avoid FP errors, and doubles the working precision when computing the range step:

julia> a = range(0, 100, length=50)

julia> a.step
Base.TwicePrecision{Float64}(2.040816326530603, 9.28055818135641e-15)

julia> 49*a.step
Base.TwicePrecision{Float64}(100.0, -5.048709793414476e-29)

julia> 49*a.step - 100.
Base.TwicePrecision{Float64}(-5.048709793414476e-29, 0.0)

In short, the 5e-29 you’re measuring here should not be seen as the (infinite) relative error of a computation returning 0, but rather as an absolute error (rather small if you consider the magnitude of all intermediate results involved).
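As an aside, the building block behind TwicePrecision is an error-free transformation such as Knuth’s two-sum, which captures the rounding error of an addition exactly (a sketch for illustration, not Base’s actual implementation):

```julia
# Knuth's two-sum: s is the rounded sum fl(a + b), and err is the exact
# rounding error, so that a + b == s + err holds exactly.
function two_sum(a::Float64, b::Float64)
    s  = a + b
    b′ = s - a
    err = (a - (s - b′)) + (b - b′)
    return s, err
end

two_sum(0.1, 0.2)   # (0.30000000000000004, -2.7755575615628914e-17)
```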

If you want to avoid such problems with inaccurate end points in ranges, you could perhaps switch to using a LinRange (which stores both end points of the range) instead of a StepRangeLen (which stores a reference point in combination with the step size):

julia> a = LinRange(0, 100, 50)
50-element LinRange{Float64}:

julia> b = 100.

julia> c = a .- b
50-element LinRange{Float64}:

julia> c[end]
0.0

Thank you for pointing out LinRange; I will use it in my case.

I didn’t expect this absolute error here, because letting the range compute its endpoint and then subtracting 100. gives exactly zero.

julia> a = range(0, 100, length=50)

julia> a[end] - 100.
0.0

Yes. Let me try to explain: computing a[end] produces the exact same absolute error of 5\times 10^{-29}, i.e. the result computed by the last instruction would be approximately 100 - 5\times10^{-29}. But then this result is rounded to the nearest representable FP number. Which is 100 in this case:

  • 100 is itself representable as a FP number (we know that not-too-large integers are always representable), and
  • the two neighbouring representable FP numbers (given by prevfloat(100.) and nextfloat(100.)) are approximately 100 \mp 1.4\times 10^{-14}.
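These spacings can be checked directly:

```julia
# The gap between consecutive Float64 values near 100 is 2^-46.
eps(100.0)               # 1.4210854715202004e-14, i.e. 2.0^-46
nextfloat(100.0) - 100   # 1.4210854715202004e-14
100 - prevfloat(100.0)   # 1.4210854715202004e-14
```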

So the result of a[end], rounded to double precision, is exactly 100. And therefore, as you noted:

julia> a[end] - 100.
0.0

In short: because a[end] is (i) not too small and (ii) exactly representable as a FP number, the round-off errors introduced in the computation a.ref + (a.len - a.offset) * a.step (which happens with twice the working precision) are so small that they vanish in the last rounding (which happens in double precision).
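That final rounding can be reproduced by hand with the TwicePrecision value computed earlier; the 5e-29 correction term is simply swallowed:

```julia
# hi + lo rounded to the nearest Float64 is exactly 100.0,
# so the correction term disappears in the conversion.
x = Base.TwicePrecision{Float64}(100.0, -5.048709793414476e-29)
Float64(x)          # 100.0
Float64(x) - 100.   # 0.0
```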


Okay, that’s it: I missed that in the first case the endpoint is rounded first and the subtraction happens afterwards. Thank you for this insight.