I started to experiment with BigFloats and I am not sure how to best (and with minimum manual effort) handle numerical constants in functions.

The following shows the behaviour as documented: the constant 2.0/3.0 does not get automatically promoted to BigFloat.

julia> setprecision(128)
128

julia> function g_one(x)
           2.0/3.0*x
       end
g_one (generic function with 1 method)

julia> g_one(BigFloat(1.0))
6.666666666666666296592325124947819858789e-01

Of course one can do

julia> function g_two(x::T) where {T}
           a::T = 2.0
           b::T = 3.0
           a/b*x
       end
g_two (generic function with 1 method)

julia> g_two(BigFloat(1.0))
6.666666666666666666666666666666666666676e-01

but note the next example (this again is as documented: promotion to BigFloat happens after 2.0/3.0 has been computed in Float64)

julia> function g_three(x::T) where {T}
           a::T = 2.0/3.0
           a*x
       end
g_three (generic function with 1 method)

julia> g_three(BigFloat(1.0))
6.666666666666666296592325124947819858789e-01

What confuses me now is this.

Assume the function is something more complicated with a lot of numerical constants. Without rewriting it (pulling out the constants, storing them in tuples while making sure they are correctly promoted), is it not possible to just switch to BigFloat and get consistent accuracy, because Float64 constants do not get promoted?

How would I reliably and with minimum effort ensure everything has the right precision? Is there perhaps some premade macro to cast every Float64 constant to BigFloat? I suspect @generated could be used for this, but that to me appears to be a bit of overkill for a quick experiment, especially since I have never had any contact with it beyond knowing it exists.
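As far as I know there is no such macro in Base, but a minimal sketch of one is short to write: walk the expression tree and replace every Float64 literal with a parse(BigFloat, ...) call on its string form. The name @bigify and the helper function are made up here for illustration:

```julia
# Recursively replace Float64 literals in an expression with
# parse(BigFloat, "...") calls, so they are re-read at full precision.
function bigify_literals(ex)
    ex isa Float64 && return :(parse(BigFloat, $(string(ex))))
    ex isa Expr && return Expr(ex.head, map(bigify_literals, ex.args)...)
    return ex
end

# Hypothetical convenience macro wrapping the helper above.
macro bigify(ex)
    esc(bigify_literals(ex))
end

g_macro(x) = @bigify 2.0/3.0 * x
```

With this, g_macro(BigFloat(1.0)) computes 2/3 at the current BigFloat precision instead of rounding in Float64 first. Note this only covers literals visible in the expression; it does nothing for functions that internally default to Float64.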

Finally, one might observe that in something like 2.0*x where x is a BigFloat this promotion behaviour is completely invisible which might lead to some confusion if one did not read the documentation.

range honours BigFloat - it's just that BigFloat conversion from floating point literals is icky, since they get converted to floating point first during parsing of the julia source:

help?> BigFloat
# snip
BigFloat(x)
Create an arbitrary precision floating point number. x may be an Integer, a Float64 or
a BigInt. The usual mathematical operators are defined for this type, and results are
promoted to a BigFloat.
Note that because decimal literals are converted to floating point numbers when parsed,
BigFloat(2.1) may not yield what you expect. You may instead prefer to initialize
constants from strings via parse, or using the big string literal.
julia> BigFloat(2.1)
2.100000000000000088817841970012523233890533447265625
julia> big"2.1"
2.099999999999999999999999999999999999999999999999999999999999999999999999999986

OK, I see. I was assuming BigFloat(x) and big"x" were equivalent. The parser always does Float64 first and calls BigFloat() on the result.

However, ignoring ChangePrecision.jl, solving my original problem becomes even more awkward:

julia> function g(x::T) where {T}
           a::T = 0.1
           a*x
       end
g (generic function with 1 method)

julia> g(BigFloat(1.0))
1.000000000000000055511151231257827021181583404541015625e-01

julia> function g(x::T) where {T}
           a = big"0.1"
           a*x
       end
g (generic function with 1 method)

julia> g(BigFloat(1.0))
1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01

This cannot be right? And what I understand of ChangePrecision.jl indicates that its approach will not work?

Edit: Of course

julia> function g(x::T) where {T}
           a::T = 10.0
           1.0/a*x
       end
g (generic function with 1 method)

julia> g(BigFloat(1.0))
1.000000000000000000000000000000000000000000000000000000000000000000000000000002e-01

It would probably be possible to get a ChangePrecision approach to work more ideally if the parser stored the literals as strings or in some decimal format, but that would be quite a big change and very possibly require painful trade-offs in other respects.

What you can do now is to use constructions like 2 * x / 3 or x * 2 / 3 or employ rationals, 2//3 * x.
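The suggestion above works because integer and rational literals are exact, so the inexact division only happens after promotion to BigFloat. A quick check (function names are just illustrative):

```julia
setprecision(128)

# 2 * x is exact; the division by 3 is then done in BigFloat arithmetic.
f_int(x) = 2 * x / 3

# Rational{Int} stays exact until the multiplication promotes it to BigFloat.
f_rat(x) = 2//3 * x
```

Both f_int(BigFloat(1.0)) and f_rat(BigFloat(1.0)) give 2/3 correct to the full 128-bit precision, unlike 2.0/3.0 * x, where the Float64 rounding error is baked in before promotion.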

ChangePrecision doesn't have much trouble with literals, because it converts them with parse(T, string(literal)), and the grisu algorithm generally outputs the desired string version of floating-point literals even for decimals (that is, it outputs a string equivalent to the one that you used to type the literal). The problem is constructors (like range): it is not possible to get it to recognize all possible functions that could default to returning Float64.
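The parse(T, string(literal)) round-trip mentioned here is easy to verify directly:

```julia
# grisu prints the shortest string that round-trips, so the typed
# literal's decimal form survives:
s = string(0.1)            # "0.1", not "0.1000000000000000055..."

# Re-parsing that string at BigFloat precision recovers the decimal
# value the user actually wrote, matching big"0.1".
x = parse(BigFloat, s)
```

This is why the literal-rewriting part of the approach is reliable; the hard part, as noted, is code that constructs Float64 values without any visible literal.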

ChangePrecision is intended as a hack so that you can take informal user scripts and quickly experiment with different precisions. If you want to write long-term precision-agnostic code, e.g. library code, then you need to be aware of types and write code that uses your argument types to construct values.

You're still using floating-point literals. It doesn't matter if they get converted via assignment to a declared BigFloat or passed directly to BigFloat() - the conversion happens at the parsing stage, long before any value assignment takes place.

As it is right now, your code is not generic, since those literals are still present. The promotion works - but because Float64 has smaller precision than a "normal" BigFloat, you still get all the inaccuracies of a regular Float64, because they are introduced before the conversion to BigFloat. You probably want something like this (EDIT: this is effectively what ChangePrecision.jl does, if I understand correctly):

julia> function g(x::T) where {T <: AbstractFloat}
           a = parse(T, "0.1")
           a*x
       end

Yes, it's cumbersome to convert all basic literals like 0.1 to the above if you want to use literals and possibly also BigFloats - and it should probably be made easier or possible without having to change so much code. Maybe the parsing behaviour could be changed to not wrap floating-point literals in a Float64 (maybe even only when surrounded by BigFloat(...))? I'd be interested to see some proposals on this, but I think that's probably pretty tricky to do.

Ah, yes! I believe that is what I wanted to know: how to dispatch correctly. That is indeed generic, and one could then switch to DoubleFloats or ArbFloats (or whatever decimal type), too, I believe.

Of course, ideally that 0.1 should be replaced by a rational number (i.e. 1/10) to make the intention clear, if one is known for the constant.

OK, big"1" is a BigInt. Yes, together with the built-in rational type that can go wrong.

I originally thought of something like (in case a mathematical rational number is known at all for a constant)

function g(x::T) where {T}
    tenth = parse(T, "1")/parse(T, "10")
    tenth*x
end

Of course tenth could be a tuple element to store multiple constants.
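One way to sketch that tuple idea, keeping the function body literal-free: a helper that builds all constants at the target precision T. The name constants and the NamedTuple fields are made up here, and a NamedTuple is used instead of a plain tuple so the constants keep readable names:

```julia
# Hypothetical helper: construct every constant at precision T, so the
# function body itself contains no floating-point literals.
constants(::Type{T}) where {T} = (
    tenth     = parse(T, "1") / parse(T, "10"),
    twothirds = parse(T, "2") / parse(T, "3"),
)

function g_const(x::T) where {T}
    c = constants(T)
    c.tenth * x + c.twothirds
end
```

Calling g_const(BigFloat(1.0)) then evaluates 1/10 + 2/3 entirely in BigFloat arithmetic, and the same code works unchanged for Float64 or other float types.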

I am not sure it makes sense. It is probably a bit verbose, but at least it makes clear what the original number was. I will have to think about it now that I know how to dispatch.

I had expected that it would get slower. On the other hand, using BigFloat is already (as expected) so slow compared to Float64 that I do not really care about speed, at the moment at least. On top of that, I see now that I can pull out rational constants in only a few equations.

Still, this is of course very useful for future reference.

I just wanted to test how large the impact of floating point rounding on a nasty convergence issue I have actually is. It turns out it is most likely the dominating factor (unless of course I messed something else up). The next step would probably be to try DoubleFloats or ArbFloats; maybe I can do a bit better than simply demonstrating the likely cause of the problem.

So, this whole discussion was highly appreciated from my side!

It is the circular restricted three-body problem, so an ODE. More specifically, the Poincaré sections, see here. This has two singularities in its potential at the locations of the heavy masses. Regularization (in my case Thiele-Burrau) can shift them to infinity. The regularization effectively casts Cartesian coordinates (and momenta) into expressions consisting of trigonometric and hyperbolic functions with complex arguments (if one wants to work with the Hamiltonian, i.e. canonical positions and momenta, these get extra lengthy). Apparently the rounding error noise is about 1e-12 relative, and of course it accumulates over longer trajectories. With BigFloats, even with only 72 bits, this is (as far as I have tested by now) much lower, and integration is much more stable.

Basically, this has been done to death by NASA in the 1960s as a prerequisite to the Apollo program. They appear to have been using recursion instead of Runge-Kutta methods or similar, see for example here and here, to solve it. I was arrogant and thought modern high-order ODE solvers (DifferentialEquations.jl) with step size control would handle it quite easily. That was wrong, in that a large part of the problem appears to be floating point accuracy, depending on the trajectory and the value of the Jacobi integral.

Or I still have an error somewhere, although I can reproduce most of Broucke's stable solutions. I have no idea how Hénon did his calculations.

Edit: To be more precise, I can reproduce Broucke's solutions with stability indices up to about 200 for about two periods, but some trajectories have indices in the 40000 range, which is a challenge.

I'm definitely in way over my head here; I'm just a lowly CS student.

Still, I've dug around a bit and found this paper from Zurich, sponsored by NASA and IBM, containing an ALGOL 60 procedure to compute what (I think, if I understood correctly) you're looking for. It might be completely useless (you probably already know about it), but they're using Runge-Kutta (fourth order, see p. 53) and also write about how a smaller integration step doesn't pay off, "because the error produced by the ephemeris then becomes dominant. However, the integration with a smaller step would improve the balance of the energy equation." (p. 57)

So, either they ran into the same problem you did and didn't have enough bits to go further (in which case, yay?), or I'm way too tired to read papers and shouldn't dig into things I'm not even remotely familiar with.