BigFloat promotion rules and constants in functions

Hello everybody,

I started to experiment with BigFloats and I am not sure how to best (and with minimum manual effort) handle numerical constants in functions.

The following shows the behaviour as documented: the constant 2.0/3.0 is not automatically promoted to BigFloat

julia> setprecision(128)
128
julia> function g_one(x)
       2.0/3.0*x
       end
g_one (generic function with 1 method)
julia> g_one(BigFloat(1.0))
6.666666666666666296592325124947819858789e-01

Of course one can do

julia> function g_two(x::T) where {T}
       a::T = 2.0
       b::T = 3.0
       a/b*x
       end
g_two (generic function with 1 method)
julia> g_two(BigFloat(1.0))
6.666666666666666666666666666666666666676e-01

but note the next (this again is as documented, promotion to BigFloat happens after 2.0/3.0 is computed in Float64)

julia> function g_three(x::T) where {T}
       a::T = 2.0/3.0
       a*x
       end
g_three (generic function with 1 method)
julia> g_three(BigFloat(1.0))
6.666666666666666296592325124947819858789e-01

What confuses me is now this.

Assume the function is something more complicated with a lot of numerical constants. Without rewriting it (pulling out the constants and storing them in tuples while making sure they are correctly promoted), is it not possible to just switch to BigFloat and get consistent accuracy, because Float64 constants do not get promoted?

How would I reliably, and with minimum effort, ensure everything has the right precision? Is there perhaps a premade macro to cast every Float64 constant to BigFloat? I suspect @generated could be used for this, but that appears to be overkill for a quick experiment, especially since I have never had any contact with it beyond knowing it exists.
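To illustrate what I have in mind (a rough, untested sketch, not an existing package): a macro that walks an expression and replaces every Float64 literal with parse(BigFloat, ...) applied to the literal's decimal string, so the value is produced at full BigFloat precision.

# Sketch only: swap Float64 literals for BigFloats parsed from their decimal strings.
function swap_literals(x)
    if x isa Float64
        return :(parse(BigFloat, $(string(x))))   # re-parse the literal at BigFloat precision
    elseif x isa Expr
        return Expr(x.head, map(swap_literals, x.args)...)
    else
        return x
    end
end

macro bigliterals(ex)
    return esc(swap_literals(ex))
end

# Hypothetical usage:
# g_big(x) = @bigliterals 2.0/3.0*x
# g_big(BigFloat(1.0))   # now carries full BigFloat precision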

Finally, one might observe that in something like 2.0*x, where x is a BigFloat, this promotion behaviour is completely invisible, which might lead to some confusion if one has not read the documentation.
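For illustration (with setprecision(128) as above): a literal such as 2.0 is exactly representable in Float64, so nothing is lost when it is promoted, while 0.1 carries its Float64 rounding into the BigFloat result:

julia> 2.0 * BigFloat(1.0)
2.0

julia> 0.1 * BigFloat(1.0)
1.000000000000000055511151231257827021181583404541015625e-01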

Best Regards

Have a look at ChangePrecision.jl

Thank you. That looks interesting, although it does not appear to do BigFloat out of the box.

Another vaguely related question:

z=range(BigFloat(0.0),stop=BigFloat(2.0),step=BigFloat(0.1))
0.0:1.000000000000000055511151231257827021181583404541015625e-01:1.9000000000000001054711873393898713402450084686279296875

So range() does not honor BigFloat at all ? That would be an unpleasant surprise if true and highly inconsistent.

EDIT: OK, LinRange() appears to be working. I am not sure what the intended behaviour of range() is, or which function is preferred.

range honours BigFloat - it’s just that BigFloat conversion from floating-point literals is icky, since they get converted to Float64 first during parsing of the Julia source:

help?> BigFloat
# snip
BigFloat(x)

Create an arbitrary precision floating point number. x may be an Integer, a Float64 or 
a BigInt. The usual mathematical operators are defined for this type, and results are
promoted to a BigFloat.

Note that because decimal literals are converted to floating point numbers when parsed, 
BigFloat(2.1) may not yield what you expect. You may instead prefer to initialize 
constants from strings via parse, or using the big string literal.

julia> BigFloat(2.1)
2.100000000000000088817841970012523233890533447265625

julia> big"2.1"
2.099999999999999999999999999999999999999999999999999999999999999999999999999986

And the range using big"...":

julia> range(big"0.0",stop=big"2.0",step=big"0.1")
0.0:1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01:2.0

The same can be done using parse(BigFloat, "0.1").

OK, I see. I was assuming BigFloat(x) and big"x" were equivalent. So the parser always produces a Float64 first, and BigFloat() is then called on the result. And yes,

julia> parse(BigFloat,readline())
0.1
1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01

However, ignoring ChangePrecision.jl, solving my original problem becomes even more awkward

julia> function g(x::T) where {T}
       a::T = 0.1
       a*x
       end
g (generic function with 1 method)
julia> g(BigFloat(1.0))
1.000000000000000055511151231257827021181583404541015625e-01

julia> function g(x::T) where {T}
       a= big"0.1"
       a*x
       end
g (generic function with 1 method)
julia> g(BigFloat(1.0))
1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01

This cannot be right? And what I understand of ChangePrecision.jl indicates that its approach will not work?

Edit: Of course

julia> function g(x::T) where {T}
       a::T = 10.0
       1.0/a*x
       end
g (generic function with 1 method)

julia> g(BigFloat(1.0))
1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01

but, really …

It would probably be possible to get a ChangePrecision approach to work more ideally if the parser stored the literals as strings or in some decimal format, but that would be quite a big change and very possibly require painful trade-offs in other respects.

What you can do now is to use constructions like 2 * x / 3 or x * 2 / 3 or employ rationals, 2//3 * x.
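A quick REPL illustration (with the setprecision(128) from the start of the thread): integer and rational literals convert to BigFloat exactly, so the division happens at BigFloat precision:

julia> x = BigFloat(1.0);

julia> 2 * x / 3
6.666666666666666666666666666666666666676e-01

julia> 2//3 * x
6.666666666666666666666666666666666666676e-01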

ChangePrecision doesn’t have much trouble with literals, because it converts them with parse(T, string(literal)) and the grisu algorithm generally outputs the desired string version of floating-point literals even for decimals (that is, it outputs a string equivalent to the one that you used to type the literal). The problem is constructors (like range) … it is not possible to get it to recognize all possible functions that could default to returning Float64.

ChangePrecision is intended as a hack so that you can take informal user scripts and quickly experiment with different precisions. If you want to write long-term precision-agnostic code, e.g. library code, then you need to be aware of types and write code that uses your argument types to construct values.
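For example (a standard idiom rather than anything specific to ChangePrecision; the name g_generic is just illustrative), you can build the constants from the argument type itself:

function g_generic(x::T) where {T<:AbstractFloat}
    c = T(2) / T(3)   # 2 and 3 convert to T exactly; the division is done in T
    return c * x
end

g_generic(BigFloat(1.0))   # 2/3 at full BigFloat precision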

You’re still using floating-point literals. It doesn’t matter whether they get converted via assignment to a declared BigFloat or passed directly to BigFloat() - the conversion to Float64 happens at the parsing stage, long before any value assignment takes place.

As it is right now, your code is not generic, since those literals are still present. The promotion works - but because the literal has only Float64 precision, you still get all the inaccuracies of a regular Float64; they are introduced before the value is converted to a BigFloat. You probably want something like this (EDIT: this is effectively what ChangePrecision.jl does, if I understand correctly):

julia> function g(x::T) where {T <: AbstractFloat}
         a = parse(T, "0.1")
         a*x
       end

Yes, it’s cumbersome to convert all basic literals like 0.1 to the above if you want to use literals and possibly also BigFloats - and it should probably be made easier/possible without having to change so much code. Maybe the behaviour of the parser could be changed to not convert floating-point literals to Float64 (maybe even only when they are surrounded by BigFloat(...))? I’d be interested to see some proposals on this, but I think that’s probably pretty tricky to do.

Right, I stand corrected.

Ah, yes! I believe that is what I wanted to know, how to dispatch correctly. That is indeed generic, and one could then switch to DoubleFloats or ArbFloats (or whatever other float type), too, I believe.

Of course, ideally that 0.1 should be replaced by a rational number (i.e. 1/10) to make the intention clear, if a rational form is known.

Thank you all for your answers !

That is interesting. I will read it as soon as I am back at work; apparently this does not use a conversion tolerance. Thank you!

Careful, 1/10 != 1//10! Also, using 1//10 could promote everything to Rational:

julia> function g(x::T) where {T}
          a = 1//10
          a*x
       end
g (generic function with 1 method)

julia> g(big"10")
1//1

julia> typeof(ans)
Rational{BigInt}

That may be more suitable for what you want though, as it still does what you expect for BigFloat:

julia> g(big"10.0")
1.0

julia> typeof(ans)
BigFloat

Bear in mind, there can still be problems.

julia> g(big"0.1")
9.999999999999999999999999999999999999999999999999999999999999999999999999999995e-03

In general, sticking to explicit parsing with the given argument type is the most generic approach.

OK, big"1" is a BigInt. Yes, together with the built-in rational type that can go wrong.

I originally thought of something like (in case a mathematical rational number is known at all for a constant)

function g(x::T) where {T}
    tenth = parse(T, "1") / parse(T, "10")
    tenth*x
end

Of course tenth could be a tuple element to store multiple constants.

I am not sure it makes sense. It is probably a bit verbose, but at least it makes clear what the original number was. I will have to think about it now that I know how to dispatch.
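A rough sketch of that tuple idea (names purely illustrative), using a NamedTuple of constants parsed at the argument type:

constants(::Type{T}) where {T} = (tenth = parse(T, "1") / parse(T, "10"),
                                  third = parse(T, "1") / parse(T, "3"))

function g_tuple(x::T) where {T}
    c = constants(T)
    return c.tenth * x
end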

Also very slow.

If you want to ensure a floating-point result you could do something like:

function g(x::T) where {T<:Number}
    tenth::float(real(T)) = 1//10
    return tenth*x
end

Well, not that much though:

julia> function g(x::T) where {T<:Number}
           tenth::float(real(T)) = 1//10
           return tenth*x
       end
g (generic function with 1 method)

julia> function f(x::T) where {T <: AbstractFloat}
           a = parse(T, "0.1")
           a*x
       end
f (generic function with 1 method)

julia> @benchmark g($x)
BenchmarkTools.Trial:
  memory estimate:  448 bytes
  allocs estimate:  8
  --------------
  minimum time:     287.234 ns (0.00% GC)
  median time:      290.780 ns (0.00% GC)
  mean time:        348.059 ns (9.57% GC)
  maximum time:     124.752 μs (99.65% GC)
  --------------
  samples:          10000
  evals/sample:     282

julia> @benchmark f($x)
BenchmarkTools.Trial:
  memory estimate:  228 bytes
  allocs estimate:  5
  --------------
  minimum time:     402.985 ns (0.00% GC)
  median time:      407.960 ns (0.00% GC)
  mean time:        467.812 ns (7.00% GC)
  maximum time:     173.055 μs (99.70% GC)
  --------------
  samples:          10000
  evals/sample:     201

But yours will scale better with bigger literals, because parsing that string is slow.

I had expected that it would get slower. On the other hand, using BigFloat is already (as expected) so slow compared to Float64 that I do not really care about speed, at the moment at least :slight_smile: On top of that, I see now that I can pull out rational constants in only a few equations.

Still, this is of course very useful for future reference.

I just wanted to test how large the impact of floating-point rounding actually is on a nasty convergence issue I have. It turns out it most likely is the dominating factor (unless of course I messed something else up). The next step would probably be to try DoubleFloats or ArbFloats; maybe I can do a bit better than simply demonstrating the likely cause of the problem.

So, this whole discussion was highly appreciated from my side !

If you care about an error term, maybe Measurements.jl is for you:

No, that will not help, I believe.

It is the circular restricted three-body problem, so an ODE. More specifically, the Poincare sections, see here. This has two singularities in its potential at the locations of the heavy masses. Regularization (in my case Thiele-Burrau) can shift them to infinity. The regularization effectively casts Cartesian coordinates (and momenta) into expressions consisting of trigonometric and hyperbolic functions with complex arguments (if one wants to work with the Hamiltonian, i.e. canonical positions and momenta, these are extra lengthy). Apparently the rounding error noise is about 1e-12 relative, and of course it gets summed up over longer trajectories. With BigFloats, even with only 72 bits, this is much lower as far as I have tested so far, and the integration is much more stable.

Basically, this was done to death by NASA in the 1960s as a prerequisite to the Apollo program. They appear to have used recursion instead of Runge-Kutta methods or similar to solve it, see for example here and here. I was arrogant and thought modern high-order ODE solvers (DifferentialEquations.jl) with step size control would work quite easily. That was wrong, in that a large part of the problem appears to be floating-point accuracy, depending on the trajectory and the Jacobi integral value.

Or I still have an error somewhere, although I can reproduce most of Broucke's stable solutions. I have no idea how Henon did his calculations.

Edit: To be more precise, I can reproduce Broucke's solutions with stability indices up to about 200 for about two periods, but some trajectories have indices in the 40000 range, which is a challenge.

I’m definitely way in over my head here, I’m just a lowly CS student :smile:

Still, I’ve dug around a bit and found this paper from Zurich, sponsored by NASA and IBM, containing an ALGOL 60 procedure to compute what (I think, if I understood correctly) you’re looking for. It might be completely useless (you probably already know about it), but they’re using Runge-Kutta (fourth order, see p.53) and also write about how a smaller integration step doesn’t pay off, “because the error produced by the ephemeris then becomes dominant. However, the integration with a smaller step would improve the balance of the energy equation.” (p.57)

So, either they ran into the same problem you did and didn’t have enough bits to go further (in which case, yay?) or I’m way too tired to read papers and shouldn’t dig into things I’m not even remotely familiar with :smiley:

You may already be familiar with these options, but just in case, have you tried the symplectic integrators?
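For instance (a rough sketch following the OrdinaryDiffEq documentation; the harmonic-oscillator right-hand side is just a placeholder, not your problem), symplectic methods are used with a SecondOrderODEProblem and a fixed step size:

using OrdinaryDiffEq

# Placeholder acceleration: q'' = -q (harmonic oscillator)
accel!(ddq, dq, q, p, t) = (ddq .= -q)

q0  = [1.0, 0.0]
dq0 = [0.0, 1.0]
prob = SecondOrderODEProblem(accel!, dq0, q0, (0.0, 100.0))

sol = solve(prob, KahanLi8(), dt = 0.01)   # symplectic integrators need a fixed dt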

Alternatively, you can try conservation via manifold projection.

According to DiffEq’s FAQ, it’s better to use callbacks for energy conservation. The Henon-Heiles Energy Conservation Benchmark and Quadrupole Boson Hamiltonian Energy Conservation Benchmark go into more detail.
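As a rough sketch of the callback approach (adapted from the DiffEqCallbacks documentation for a toy harmonic oscillator; please check the docs for the exact residual signature in your version), the conserved quantity is written as a residual and the solution is projected back onto it after each step:

using OrdinaryDiffEq, DiffEqCallbacks

# Toy system: u = (position, velocity), with u[1]^2 + u[2]^2 conserved
f!(du, u, p, t) = (du[1] = u[2]; du[2] = -u[1])
prob = ODEProblem(f!, [1.0, 0.0], (0.0, 100.0))

# Residual of the invariant; the callback projects back onto g! = 0
function g!(resid, u, p, t)
    resid[1] = u[1]^2 + u[2]^2 - 1.0
    resid[2] = 0.0
end

sol = solve(prob, Vern7(), callback = ManifoldProjection(g!))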

Hopefully that helps!

EDIT: In fact, I just noticed you mentioned Henon’s calculation, so you should definitely give these benchmarks a look!