Get length of a fully represented float

Hello,

I must work with a dataset of floating-point entries. A few of them (around 0.1%) are really small and are therefore written in e-notation (e.g. 1.87589e-17). I’m passing this dataset to a C program via Julia, and apparently it does not recognize 1.87589e-17 as a numerical value ( :weary:). Therefore I’d like to convert these numbers to their fully expanded format, e.g. 0.000000000000000187589 (I didn’t count the actual zeros).

To do this I can use @printf "%0.*f" 22 1.87589e-17. The problem is that 22 is not always the right length for the expansion. So I would like to know whether there is a formula that, given a number, returns the number of digits after the decimal point (22 here)?

I know I could pick a large length and apply to all numbers, but this would greatly increase the size of the data files.

Unfortunately not in general - numbers with recurring digits like 1/9 won’t stop.

Yeah, but this is in the context of a parseable floating point number. Here’s a simple thing that should be sufficient:

ndigits(v) = max(0, ceil(Int, -log10(eps(v))))

It looks like this gives an upper bound, no?
ndigits(1.) gives 16 and ndigits(1.8757e-17) gives 33.
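
For what it's worth, here is a minimal sketch of how that bound could be plugged into the formatting step. The names ndecimals and expand are made up for this example (ndecimals is the same formula, just renamed so it doesn't shadow Base.ndigits), and Printf.format with an interpolated precision is used as a stand-in for the %0.*f call.

```julia
using Printf

# Hypothetical name: same formula as above, renamed to avoid shadowing Base.ndigits.
ndecimals(v) = max(0, ceil(Int, -log10(eps(v))))

# Hypothetical helper: fixed-notation string with ndecimals(v) digits after the point.
expand(v) = Printf.format(Printf.Format("%.$(ndecimals(v))f"), v)

s = expand(1.87589e-17)
println(s)
println(parse(Float64, s) == 1.87589e-17)   # does the string parse back to the same float?
```

Since this is an upper bound, the strings come out longer than strictly necessary, so it trades file size for simplicity.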

There is no floating point number equal to 1//9. Every IEEE754 floating point number has a finite decimal representation because every finite power of 2 has a finite decimal representation.

julia> big(1/9)
0.111111111111111104943205418749130330979824066162109375

julia> big(1/9) |> nextfloat # notice all these "spare" digits
0.1111111111111111049432054187491303309798240661621093750000000000000000000000011

Although far fewer digits than this are necessary to resolve the float uniquely. That number is roughly given by the ndigits function suggested above (maybe add 1 to be safe? I haven’t thought hard about it).
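
To make the "finite but long" point concrete, here is a small sketch that counts how many decimal places the exact expansion of a Float64 has. The function name exact_decimals is hypothetical. It relies on the fact that a finite float is exactly m/2^k in lowest terms, and m/2^k = m*5^k/10^k, so the exact expansion has k digits after the decimal point.

```julia
# Hypothetical helper: number of decimal places in the exact decimal expansion of x.
function exact_decimals(x::Float64)
    isfinite(x) || throw(ArgumentError("x must be finite"))
    r = Rational{BigInt}(x)                  # exact value m // 2^k in lowest terms
    return trailing_zeros(denominator(r))    # k, because m/2^k == m*5^k / 10^k
end

println(exact_decimals(0.5))           # 0.5 is 1/2
println(exact_decimals(0.75))          # 0.75 is 3/4
println(exact_decimals(0.1))           # the float nearest 0.1, not the decimal 0.1
println(exact_decimals(1.87589e-17))
```

This is the count for the exact value, which is usually far more than what is needed to identify the float uniquely.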

Yes, being smarter is harder :slight_smile:

In general, efficiently determining the minimal number of decimal digits required to uniquely identify a particular binary floating point number (so that it parses back to the same value) is a hard problem on which many academic papers have been published (see Grisu, Ryū; Julia itself uses the latter).
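
Since string(x) already prints those shortest digits, one possible shortcut is to read the digit count off that string instead of computing it. The sketch below is only an illustration: shortest_decimals and expand_shortest are made-up names, the regex assumes the usual "d.ddd" or "d.ddde-NN" output of string for finite Float64 values, and it is worth checking that the result parses back to the original number before relying on it.

```julia
using Printf

# Hypothetical helper: decimal places needed to write x in fixed notation
# using the same significant digits that string(x) shows.
function shortest_decimals(x::Float64)
    m = match(r"^(\d+)\.(\d+)(?:e(-?\d+))?$", string(abs(x)))
    m === nothing && throw(ArgumentError("expected a finite number, got $x"))
    frac = m.captures[2]
    e = m.captures[3] === nothing ? 0 : parse(Int, m.captures[3])
    nfrac = frac == "0" ? 0 : length(frac)   # "1.0" carries no real fractional digit
    return max(nfrac - e, 0)
end

# Hypothetical helper: fixed-notation string with exactly that many decimals.
expand_shortest(x) = Printf.format(Printf.Format("%.$(shortest_decimals(x))f"), x)

println(shortest_decimals(1.87589e-17))
println(expand_shortest(1.87589e-17))
```

For 1.87589e-17 the mantissa has 5 fractional digits and the exponent is -17, which gives the 22 from the original question.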

Oh okay, I thought this would be a trivial problem. I’ll settle for your formula then. Thank you.

I’ve edited my answer to do slightly better for large numbers and more specifically target the behavior of the %0.*f format.

If you have any control over that C program, it’d be so much simpler to change its parsing to use sscanf with a %g format instead.

I could clone the repo, edit, then make a custom jll, but it’s not worth the trouble. What you proposed worked just fine.

You could reinterpret(UInt64, x), then cast back on the C side.
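
A rough sketch of the Julia side of that idea, assuming the C program could be made to read an unsigned 64-bit integer and copy its bits into a double (that C part is an assumption and not shown). The names encode_bits and decode_bits are hypothetical; decode_bits only mimics what the C side would do, to show that nothing is lost.

```julia
# Hypothetical helpers: write a Float64 as the decimal form of its raw bit pattern.
encode_bits(x::Float64) = string(reinterpret(UInt64, x))

# Stand-in for the C side: parse the integer and reinterpret its bits as a double.
decode_bits(s::AbstractString) = reinterpret(Float64, parse(UInt64, s))

x = 1.87589e-17
s = encode_bits(x)
println(s)                       # plain integer, no e-notation
println(decode_bits(s) === x)    # bit-for-bit identical
```

The obvious downside is that the data file stops being human-readable.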

You can also take the long and dirty route of manipulating strings.
Warning: the following isn’t pretty:

function deexponent(fp)
    s = string(fp)
    m = match(r"([^e]*)e([\-0-9]*)",s)
    isnothing(m) && return s
    if s[1] == '-'
        neg = true
        s = s[2:end]
    else
        neg = false
    end
    e = parse(Int, m.captures[2])
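    # Split the mantissa into the digits before (bd) and after (ad) the decimal point.
    m2 = match(r"([0-9]*)\.([0-9]*)", s)
    if isnothing(m2)
        bd, ad = lstrip(m.captures[1], '-'), ""   # no decimal point: no fractional digits
    else
        bd, ad = m2.captures[1], m2.captures[2]
    end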
    blen = length(bd)
    alen = length(ad)
    c = blen + e
    if c < 0
        res = "0."*"0"^(-c) * bd * ad
    elseif c > alen+blen
        res = bd * ad * "0"^(c-alen-blen) * ".0"
    else
        res = (bd*ad)[1:c] * "." * (bd*ad)[c+1:end]
    end
    neg ? "-"*res : res
end

With this patchy function:

julia> deexponent(1.87589e-17)
"0.0000000000000000187589"

julia> 0.0000000000000000187589
1.87589e-17

And other examples work as well.
This method might have string bugs, but doesn’t have log10 bugs.

I would suspect that may have rounding issues: changing powers of 10 can change where the values round, since 10 isn’t a power of 2. So a decimal that is the shortest representation at one particular power of 10 won’t necessarily round the same way at another.
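
One way to test a suspicion like this empirically is a round-trip check: convert, parse back, and compare. The sketch below uses a made-up helper name, failures, and takes the conversion function as an argument so that any of the approaches in this thread can be passed in; string is used here only as a baseline.

```julia
# Hypothetical helper: return the inputs whose converted string
# does not parse back to the same Float64.
failures(tostr, xs) = [x for x in xs if parse(Float64, tostr(x)) != x]

xs = [1.87589e-17, 0.1, 1/9, 1e-300, 123456.789, 5e-324]

# Baseline: Julia's own string conversion.
println(failures(string, xs))
```

An empty result on a handful of values is no proof, but running it over the actual dataset would show whether any entry gets changed by the conversion.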

Also try the code below, which works with significant digits and is based on this other post.

using Printf

function full_float_signif(x::Float64, sigdig::Int)
    (x == 0) && (return (1, "0"))
    x = round(x, sigdigits=sigdig)
    n = length(@sprintf("%d", abs(x)))              # length of the integer part
    if (x ≤ -1 || x ≥ 1)
        decimals = max(sigdig - n, 0)               # 'sig - n' decimals needed 
    else
        Nzeros = ceil(Int, -log10(abs(x))) - 1      # No. zeros after decimal point before first number
        decimals = sigdig + Nzeros
    end
    s = @sprintf("%.*f", decimals, x)
    return length(s), s
end

# Example:
full_float_signif(1.87589e-17, 6)    # (24, "0.0000000000000000187589")