Higher precision leads to incorrect results?

I’m working through Arthur & Vassilvitskii’s \Omega(n^2) lower bound proof for the iterative closest point (ICP) algorithm in \mathbb{R} (Theorem 3.4 in the paper) and thought it’d be neat to visualize their construction in Julia and Pluto.

However, I’m running into what I believe must be precision issues. I’ve never worked with arbitrary-precision arithmetic before, so there may be something I’m missing.

The construction relies on a subset of adjacent points in \mathcal{B} being increasingly close together: b_i = \sum_{j=0}^{i-1}\frac{1}{k^j} with k = 3n + 2, so the consecutive gaps b_{i+1} - b_i = \frac{1}{k^i} become very small for larger n and i.
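
For a sense of scale, here’s a quick sketch (at the default BigFloat precision, using the definitions above):

n = 8; k = big(3n + 2)              # k = 26
b(i) = sum(1 / k^j for j ∈ 0:i-1)   # 1 / BigInt gives a BigFloat
b(9) - b(8)                         # ≈ 1/26^8 ≈ 4.8e-12; the gaps shrink geometrically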

So I tried sprinkling my code with BigFloats (by which I mean I believe I’ve turned all arithmetic operands into BigFloats – although if there’s a more global and less manual/messy way to do that, I’d be happy to hear about it) and using setprecision(...), but the results I’m getting are very inconsistent and confusing to me.

Here’s a quick video demonstration of what I’m about to describe in text:

For n = 8, the ICP run I’m getting with setprecision(2000) looks correct, i.e. ICP takes 82 iterations. Increasing n at that precision, however, leads to incorrect iteration counts – sometimes they’re very high, sometimes extremely low (linear, even).

As can be seen in the video, even keeping n = 8 and increasing the precision leads to wild fluctuations in the number of ICP iterations needed.

  • n=8 and setprecision(2000): 82 iterations :white_check_mark:
  • n=8 and setprecision(3000): 19 iterations :x:
  • n=8 and setprecision(4000): 10 iterations :x:
  • n=8 and setprecision(5000): 82 iterations :white_check_mark:
  • n=8 and setprecision(6000): 10 iterations :x:
  • n=8 and setprecision(7000): 10 iterations :x:
  • n=8 and setprecision(8000): 64 iterations :x:
  • n=8 and setprecision(9000): 10 iterations :x:
  • n=8 and setprecision(10000): 10 iterations :x:

Could somebody explain this behaviour to me, and what I’m missing? I’d expect higher precision to result in the same (correct) 82 iterations for n=8, and to help get better results for higher n.

It’s impossible to say for sure without seeing your code, but my guess would be that you are accidentally doing some portion of the calculation at lower precision.

e.g. if you do x * 1.1, then even if x is a BigFloat, the literal constant 1.1 is a Float64 value, i.e. the decimal value 1.1 rounded to the closest value in 64-bit binary floating point. Even though the output is a BigFloat, because one of your inputs was the imprecise 1.1, you have suddenly introduced an error much larger than the BigFloat precision into your calculation:

julia> big"1.1" - 1.1
-8.881784197001252323389053344726562499999999999999999999999999654553257796222215e-17

So make sure that any decimal literal constants that are not exact in Float64 (i.e. not an integer times a power of 2) are expressed as big"..." so that they are rounded to the closest BigFloat value.
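
One quick way to vet a literal is to compare its Float64 round-trip against the parsed BigFloat; is_exact_in_float64 below is just a throwaway helper made up for this:

is_exact_in_float64(s) = big(parse(Float64, s)) == parse(BigFloat, s)

is_exact_in_float64("0.5")   # true:  0.5 = 1 * 2^-1
is_exact_in_float64("1.25")  # true:  1.25 = 5 * 2^-2
is_exact_in_float64("1.1")   # false: not an integer times a power of 2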


You can use ChangePrecision.jl to easily change all the literals in an expression to BigFloat:

julia> @changeprecision BigFloat begin
           big"1.1" - 1.1
       end
0.0

Can you show us more code?

Here is the code for the Pluto.jl notebook

I’m not using a lot of literals, and earlier, when I wrapped the few I do use in BigFloat(...) (e.g. BigFloat(1)), that didn’t seem to fix the issue. If someone can point out where I’m accidentally operating at lower precision, I’d appreciate it.

I have now used @changeprecision as @stillyslalom recommended, but it doesn’t seem to have resolved the issue.

I see a few expressions such as ℓ::BigFloat = sum([1 / k^(j-1) for j ∈ 1:n+1]). setprecision has no effect here: the right-hand side is evaluated with Float64-based arithmetic, and only the final result is converted to BigFloat.
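
A minimal sketch of that failure mode, assuming k were a plain Int (with a BigInt k the divisions already happen at BigFloat precision):

k, n = 5, 10
ℓ_bad  = BigFloat(sum([1 / k^(j-1) for j ∈ 1:n+1]))  # Float64 sum, widened afterwards
ℓ_good = sum([1 / big(k)^(j-1) for j ∈ 1:n+1])       # every term computed as a BigFloat
ℓ_bad == ℓ_good                                      # false: ℓ_bad carries Float64 rounding error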

edit: Never mind, you are now using the @changeprecision macro.


Note that BigFloat(1.1) != big"1.1": the former converts the Float64 value 1.1, while the latter parses the decimal literal directly as a BigFloat. But if @changeprecision didn’t help, I do not expect this to be the problem.
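
A quick check at the REPL (default precision; the difference is essentially the Float64 rounding error of 1.1 shown above):

BigFloat(1.1) == big"1.1"    # false
BigFloat(1.1) - big"1.1"     # ≈ 8.88e-17
BigFloat(1.25) == big"1.25"  # true: 1.25 is exact in Float64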


Good to know, thank you. I was not aware of that. However, as you say, I’m using @changeprecision now and am still getting the aforementioned issues, so this probably wasn’t the problem.

@changeprecision is a fun hack, but it is not a panacea — it doesn’t necessarily know about all possible functions that might produce a floating-point output.


I suggest you rewrite your code, mainly to remove the type annotations. Write small functions that are type-generic, i.e. which take objects of any type. structs should be parametrized by a type T. That way you should be able to pass around objects of either Float64 or BigFloat type, as determined by the initial data.

Then it should be easier to isolate which functions are causing the problem, in which case you can use something like

function f(x::T) where {T}
   return T(1) * x
end

to get something of the correct type.
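
For the structs, it would look something like this (a sketch with a hypothetical Interval type, not code from your notebook):

struct Interval{T<:Real}  # parametrized by the element type T
    lo::T
    hi::T
end

width(iv::Interval) = iv.hi - iv.lo  # returns whatever number type the data came in as

Interval(0.0, 0.5)            # Interval{Float64}
Interval(big"0.0", big"0.5")  # Interval{BigFloat}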


@dpsanders Wouldn’t I run into the same problem @Elrod described here? Namely, that T(1) for T == BigFloat just converts the literal from its default type to BigFloat, as opposed to creating the BigFloat closest to the literal’s value?

function f(x::T) where {T}
   return T(1.1) - x
end

f(big"1.1") == big"0" # false

Then again, I don’t believe I’m really using any literals that should lead to imprecision, like 1.1. As far as I can see, I’m only using the integers 0, 1, 2, 3, and the clean fraction 0.5, none of which should introduce precision errors – or should they?

function f(x, y)
   return x - y
end

f(big"0.5", 0.5) == big"0"                  # true
f(big"0", 0) == big"0"                      # true
f(big"1", 1) == big"0"                      # true
f(big"2", 2) == big"0"                      # true
f(big"3", 3) == big"0"                      # true
big"1" / big"3"^big"2" == BigFloat(1) / 3^2 # true

Exactly: if you’re using integer literals, they will automatically be promoted to BigFloat correctly without you even converting manually.

Just start from a BigFloat at the beginning and pass it through everywhere.
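
e.g. with a generic helper, the result type simply follows the input (a toy sketch; geomsum is made up):

geomsum(x, n) = sum(x^j for j ∈ 0:n)

geomsum(1 / 26, 8)       # Float64 in, Float64 out
geomsum(1 / big(26), 8)  # BigFloat in, BigFloat out at the current precision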

I tried several things, but I still do not see which operations in this Pluto.jl notebook would be performed at low precision and thus lead to the unexpected behaviour I’m encountering.

To be specific, I’d expect the cell

construction, t, c = run_full_construction(n);

to result in 82 ICP iterations for n=8, even when setting the precision to a value above 2000.

I’d also expect sufficiently high precision values to result in (n+1)^2 + 1 iterations for other n.
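
i.e. concretely (expected is just a hypothetical helper):

expected(n) = (n + 1)^2 + 1
expected(8)  # 82
expected(9)  # 101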

1 is exactly represented (as both an Int and a BigFloat), so the conversion to BigFloat is exact.

Similarly, T(1.25) is exact, because 1.25 is exactly represented by Float64 (and BigFloat).

The problem comes for something like T(1/3), because that first computes 1/3 inexactly in Float64 (1/3 is not exactly representable in binary floating point) before converting to T, so you are limited by the accuracy of the first Float64 step.

But by the same token, 1 * x and 1.25 * x are perfectly fine too — they will be promoted to BigFloat anyway if x is a BigFloat. (In fact, IIRC there are fast methods for Int * BigFloat and Float64 * BigFloat that avoid the intermediate conversion.)
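
To make the T(1/3) point concrete, a quick sketch:

x = BigFloat(1/3)    # 1/3 is rounded in Float64 first, then widened
y = 1 / BigFloat(3)  # the division happens at BigFloat precision
x == y               # false; x - y is on the order of 1e-17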


Which is why I don’t understand how I might be getting lower-precision results when the only literals I’m using are 0, 1, 2, 3, and 0.5.

To encourage anyone willing to look at my code, here are the relevant structs and functions, with all Pluto and plotting stuff stripped away:

using Statistics # for mean, used in icp1d below

struct Region
	A::Vector{BigFloat} # point set A
	B::Vector{BigFloat} # point set B
	width::BigFloat     # largest (a, b) distance

	function Region(A::Vector{BigFloat}, B::Vector{BigFloat})
		width::BigFloat = maximum([abs(a-b) for a ∈ A, b ∈ B])
		return new(sort(A), sort(B), width)
	end
end

struct ICPConstruction
	regions::Vector{Region}
	A::Vector{BigFloat} # point set A
	B::Vector{BigFloat} # point set B
	δ::BigFloat         # largest (a, b) distance within any region
	
	"Assembles full ICP construction with A and B point sets (vectors)."
	function ICPConstruction(regions::Vector{Region})
		# extract, concatenate, and sort overall A and B from regions
		A = reduce(vcat, map(r -> r.A, regions)) |> sort
		B = reduce(vcat, map(r -> r.B, regions)) |> sort
		δ = reduce(vcat, map(r -> r.width, regions)) |> maximum
		return new(regions, A, B, δ)
	end
end

function add_region(c::ICPConstruction, new::Region)::ICPConstruction
	δ::BigFloat = maximum([c.δ, new.width])
	min_old::BigFloat = minimum(vcat(c.A, c.B))
	max_new::BigFloat = maximum(vcat(new.A, new.B))
	shift::BigFloat = abs(min_old - max_new) + 3δ + 1
	
	A′ = new.A .- shift
	B′ = new.B .- shift
	regions = vcat(c.regions, Region(A′, B′))
	return ICPConstruction(regions)
end

function AugmentedLinearShifter(n::BigInt)::ICPConstruction
	# width of augmented linear shifter is (2n + 1)ℓ - 1 / k^n
	k::BigInt = 3n + 2
	ℓ::BigFloat = sum([1 / k^j for j ∈ 0:n])

	# Define A
	A::Vector{BigFloat} = [-2i * ℓ for i ∈ 0:n]

	# Define B
	b₀::BigFloat = 0.0
	b(i) = sum([1 / k^j for j ∈ 0:i-1])
	bᵢ::Vector{BigFloat} = [b(i) for i ∈ 1:n]
	B::Vector{BigFloat} = vcat(b₀, bᵢ)

	# Define Linear Shifter
	linear_shifter = Region(A, B)
	return ICPConstruction([linear_shifter])
end

struct Redirector
	regions::Tuple{Region, Region}
	A::Vector{BigFloat} # A₁ ∪ A₂
	B::Vector{BigFloat} # B₁ ∪ B₂
	v::BigFloat
	y::BigFloat

	function Redirector(v::BigFloat, y::BigFloat, k::BigInt)
		# Region 1
		a::BigFloat = 0.5k * v - y # a is y to the left of the (b₁ b₂) midpoint
		b₁::BigFloat = 0.0
		b₂::BigFloat = k * v
		A₁ = [a]
		B₁ = [b₁, b₂]
		R₁ = Region(A₁, B₁)

		# Region 2
		a′::BigFloat = 0.0
		A₂ = [a′]
		B₂ = [a′ + 0.5]
		R₂ = Region(A₂, B₂)

		regions = (R₁, R₂)
		A = reduce(vcat, map(r -> r.A, regions)) |> sort
		B = reduce(vcat, map(r -> r.B, regions)) |> sort
		return new((R₁, R₂), A, B, v, y)
	end
end

function icp1d(construction::ICPConstruction; k = big"0")
	A::Vector{BigFloat} = copy(construction.A)
	B::Vector{BigFloat} = copy(construction.B)

	t::BigFloat = 0.0
	correspondences::Vector{Int64} = []
	for a ∈ A
		neighbor = argmin(abs.(B .- a))
		push!(correspondences, neighbor)
	end

	# Remember transformations for visualization
	ts::Vector{BigFloat} = []
	cs::Vector{Vector{Int64}} = []
	
	while true
		# remember iteration state for visualization
		push!(ts, t)
		push!(cs, copy(correspondences))
		
		# find nearest neighbours
		BNeighbours = @view B[correspondences]

		# calculate mean for each point cloud
		a0::BigFloat = k == 0 ? mean(A) : sum(A) / k
		b0::BigFloat = k == 0 ? mean(BNeighbours) : sum(BNeighbours) / k

		# compute transformations
		t = b0 - a0
		A .+= t

		# compute new correspondences
		correspondences = []
		for a ∈ A
			neighbor = argmin(abs.(B .- a))
			push!(correspondences, neighbor)
		end
		
		# terminate if nearest neighbours don't change
		if correspondences == cs[end]
			push!(ts, t)
			break
		end
	end
	
	return ts, cs
end

function run_full_construction(n::BigInt)
	k::BigInt = 3n + 2
	ℓ::BigFloat = sum([1 / k^j for j ∈ 0:n])

	construction = AugmentedLinearShifter(n);
	redirectors = [Redirector(ℓ + 1, (2i + 1)ℓ, k) for i ∈ 0:n-1]
	redirector_regions = map(r -> collect(r.regions), redirectors) |> Iterators.flatten |> collect

	for region ∈ redirector_regions
		construction = add_region(construction, region)
	end

	# determine kickoff translation
	x₁::BigFloat = icp1d(construction, k=k)[1][2]

	a′::BigFloat = big"0.0"
	b′::BigFloat = a′ + (1.0 - x₁)k
	kicker_region = Region([a′], [b′])

	# complete construction with kicker region
	construction = add_region(construction, kicker_region)
	@assert length(construction.A) == k "k must equal |A|"
	
	# run ICP on full construction
	translations, correspondences = icp1d(construction)
	return construction, translations, correspondences
end

The ICP algorithm is run via

setprecision(2000)
n::BigInt = 4
construction, t, c = run_full_construction(n);

and length(t) gives the number of iterations.

If anyone can spot where in this code some low-precision computation could’ve sneaked in, I’d be very curious to find out indeed. Everything is BigFloat or BigInt (except one use of Int64 for indices), and the only literals are the aforementioned integers and 0.5, none of which should introduce imprecision as far as I can tell.

Thanks for the reproducible example, but I think the definition of run_full_construction is missing for us to actually be able to reproduce the calculation.


Apologies, that’s silly of course! I added the definition of run_full_construction to the previous code post. Thank you so much for having a look. :slight_smile:

Here’s the code ready to be saved and run in a command line as follows, with the two command line arguments n and precision:

(base) chrisoffner@MBP Desktop % julia icp_precision.jl 8 2000

🌟 Running one-dimensional ICP for n = 8 with precision 2000...
🌟 ICP took 82 iterations. 82 were expected. 🟢

(base) chrisoffner@MBP Desktop % julia icp_precision.jl 8 3000

🌟 Running one-dimensional ICP for n = 8 with precision 3000...
🌟 ICP took 19 iterations. 82 were expected. 💔

(base) chrisoffner@MBP Desktop % julia icp_precision.jl 9 2000

🌟 Running one-dimensional ICP for n = 9 with precision 2000...
🌟 ICP took 21 iterations. 101 were expected. 💔

(base) chrisoffner@MBP Desktop % julia icp_precision.jl 8 5000

🌟 Running one-dimensional ICP for n = 8 with precision 5000...
🌟 ICP took 82 iterations. 82 were expected. 🟢
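
(For context, the driver at the bottom of the file is essentially the setprecision / run_full_construction snippet from earlier wrapped in ARGS parsing – a hypothetical sketch, not the actual file:)

# hypothetical reconstruction of the icp_precision.jl entry point
n = parse(BigInt, ARGS[1])
prec = parse(Int, ARGS[2])
setprecision(prec)

println("🌟 Running one-dimensional ICP for n = $n with precision $prec...")
construction, t, c = run_full_construction(n)
expected = (n + 1)^2 + 1
status = length(t) == expected ? "🟢" : "💔"
println("🌟 ICP took $(length(t)) iterations. $expected were expected. $status")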

Based on the following experiment:

using Plots, Statistics # assuming Plots.jl for plot; Statistics for mean

ls = map(20:20000) do p
    setprecision(p)
    construction, t, c = run_full_construction(BigInt(8));
    length(t)
end

plot(map(1:1:17980) do i
    mean(==(82),  ls[i:i+2000])
end)

(plot: fraction of runs returning 82 iterations in each sliding window of 2000 precision values)
I would say that there is no correlation between the precision and getting the expected result.

I don’t know if this is useful, but for precision larger than 50, the iteration count was always of the form 9k+1, k \le 9.
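
(That is consistent with the n = 8 counts listed at the top of the thread; note 9 = n + 1 here:)

counts = [82, 19, 10, 82, 10, 10, 64, 10, 10]  # the n = 8 results reported above
all(c -> c % 9 == 1, counts)                   # true: every count is of the form 9k + 1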

Thank you, although I’m not sure what this tells us.

I only know that for n = 8 and precision 2000 it behaves as expected: all the Redirector regions “fire” in the expected cascade from right to left, i.e. the right red point in each of them changes its nearest neighbour (indicated by the grey lines between red and blue points) every n+1 iterations, until each of the red points on the very right (the “Linear Shifter”) has traversed the Linear Shifter’s blue points in \mathcal{O}(n) iterations, which leads us to the desired quadratic running time.

Not quite sure how to interpret that. In the code we define k = 3n + 2, which means for n = 8 we have k = 26, and for precision 2000 (> 50) we get 82 iterations, which is unequal to 9k+1 = 235, so I’ll assume you’re not referring to the k in the code.

In any event, shouldn’t the number of iterations stay constant from a certain precision on up?