Convert Rational types to float

Hello everyone,

When converting a Rational type to floating point, I expect the result to be correct to machine precision. In other words, Float64(a//b) should be the closest floating-point number to a/b.

I found some instances that do not satisfy this requirement, for example

julia> r = 18940981//3
18940981//3

julia> Float32(r) == Float32(Float64(r))
false

Here I’m using Float64 as a proxy for a higher-precision calculation. This happens because the numerator is not exactly representable as a Float32. I think there exist examples for Float64 as well, but I don’t have one at the moment (I found this one by a brute-force search).

First of all, is my logic correct? Should we care about improving the accuracy of such conversions? I’m using rational numbers to compute high-order finite-difference stencils and converting them back to floating point when assembling a matrix, so it makes sense to have the most accurate representation possible.
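For concreteness, here is a toy version of that workflow (illustrative only, not my actual code), using the standard 4th-order central first-derivative stencil. The coefficients are computed exactly as rationals, and rounding happens only at the final conversion:

```julia
# Standard 4th-order central difference coefficients for f'(x),
# on the points [x-2h, x-h, x, x+h, x+2h], kept exact as rationals.
coeffs = [1//12, -2//3, 0//1, 2//3, -1//12]

sum(coeffs) == 0//1     # true: the consistency condition holds with no rounding

Float64.(coeffs)        # rounding happens only here, when assembling the matrix
```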

Here is the Julia implementation of the conversion:

AbstractFloat(x::Rational) = (float(x.num)/float(x.den))::AbstractFloat
function (::Type{T})(x::Rational{S}) where T<:AbstractFloat where S
    P = promote_type(T,S)
    convert(T, convert(P,x.num)/convert(P,x.den))::T
end

I have also checked the Boost rational implementation, and it uses the same logic: first convert to the floating-point type, then perform the division.
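One workaround sketch (not what Base or Boost do): perform the division in extended precision so that the only rounding to the target type happens once, at the end. The `exactround` name and the guard-bit count below are mine and heuristic, not a proven bound:

```julia
# Hypothetical helper (name and precision choice are mine, not Base API):
# divide in BigFloat so that rounding to T happens only once, at the end.
function exactround(::Type{T}, x::Rational) where {T<:AbstractFloat}
    bits = 2 * 8 * sizeof(x.num) + 32   # generous guard bits; heuristic
    setprecision(BigFloat, bits) do
        T(BigFloat(x.num) / BigFloat(x.den))
    end
end

exactround(Float32, 18940981//3)   # gives the correctly rounded Float32
```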

julia> num, den = 18940981, 3;

julia> Float32(num) / Float32(den)           # direct path
6.31366f6

julia> Float32(Float64(num) / Float64(den))  # two-step path via Float64
6.3136605f6

The inequality arises because converting the Rational{Int64} 18940981//3 (exactly 6313660 + 1/3) to Float32 directly differs from converting it first to Float64 (higher precision) and then to Float32. See the manual section on Conversion and Promotion.


Right, the algorithm of first converting the numerator and denominator to the floating-point type, and then doing the division, rounds twice whenever the numerator or denominator is not exactly representable, so it is fundamentally not guaranteed to give you exact rounding.

An example for Float64 is:

julia> r2 = (Int(maxintfloat(Float64))+1234567)//7
9007199255975559//7

julia> setprecision(BigFloat, 256);

julia> Float64(r2) == Float64(BigFloat(r2))
false

and one can verify that Float64(r2) is indeed not the closest Float64 value:

julia> abs(Float64(r2) - BigFloat(r2)) > abs(Float64(BigFloat(r2)) - BigFloat(r2))
true
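The first of the two roundings is visible even before the division happens: the numerator of r2 exceeds maxintfloat(Float64), so converting it to Float64 already loses information (this snippet just illustrates that):

```julia
# The numerator alone is not exactly representable as a Float64, so it is
# rounded before the division by 7 even takes place.
n = Int(maxintfloat(Float64)) + 1234567   # numerator of r2 above; odd, > 2^53

Float64(n) == n   # false: the mixed Int/Float comparison is exact in Julia
```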

However, I’m not aware of an algorithm that will do better without going to higher precision (or some kind of guard bits).

It’s not crazy for us to improve the conversion of Rational to Float32 (or Float16) by converting to Float64 first.
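A sketch of that idea (the helper name is hypothetical, not a Base API): routing through Float64 fixes the Float32 example from the top of the thread, although the double rounding means this alone is not a proof of correct rounding for arbitrary Int64 operands:

```julia
# Hypothetical helper, not Base API: divide in Float64, then round to Float32.
rat_to_f32(x::Rational) = Float32(Float64(x.num) / Float64(x.den))

rat_to_f32(18940981//3)   # correctly rounded, unlike Float32(18940981//3)
```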

How does parsing of float literals work? All literals are rational. Granted, literals can only express rationals whose denominators are powers of 10, but powers of 10 aren’t particularly nice in binary, so I’m curious how hard it would be to generalize the algorithm.