Get length of a fully represented float

Hello,

I have to work with a dataset of floating-point entries. A few of them (about 0.1%) are really small and are therefore written in e-notation (e.g. 1.87589e-17). I’m passing this dataset to a C program via Julia, and apparently it does not recognize 1.87589e-17 as a numerical value. Therefore I’d like to convert these numbers to their fully expanded format, something like 0.000000000000000187589 (I didn’t count the actual zeros).

To do this I can use `@printf "%0.*f" 22 1.87589e-17`. The problem is that 22 is not always the right length for the expansion. So I would like to know whether there is a formula that returns the number of digits after the decimal point (22 here) for a given number?

I know I could pick a large length and apply to all numbers, but this would greatly increase the size of the data files.

Unfortunately not in general - numbers with recurring digits like `1/9` won’t stop.

Yeah, but this is in the context of a parseable floating point number. Here’s a simple thing that should be sufficient:

``````
ndigits(v) = max(0, ceil(Int, -log10(eps(v))))
``````

This looks like it gives an upper bound, no?
`ndigits(1.)` gives 16 and `ndigits(1.8757e-17)` gives 33.
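
For what it's worth, here is a minimal sketch of how that bound could be combined with a fixed-point format. The names `decimals_bound` and `expand` are mine (the first is renamed to avoid shadowing `Base.ndigits`), and `Printf.Format` is used to build the format string at run time. The round trip holds for this particular value; I have not checked it for every input.

```julia
using Printf

# Upper bound on the decimals needed, taken from the spacing of floats around v.
decimals_bound(v) = max(0, ceil(Int, -log10(eps(v))))

# Fixed-point string with `decimals_bound(v)` digits after the point.
expand(v) = Printf.format(Printf.Format("%.$(decimals_bound(v))f"), v)

x = 1.87589e-17
s = expand(x)
println(s)
println(parse(Float64, s) == x)   # does the expanded string parse back to x?
```

Since the bound is generous, the string may carry a few digits more than strictly needed.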

There is no floating point number equal to `1//9`. Every finite IEEE754 floating point number has a finite decimal representation because every finite power of 2 has a finite decimal representation.

``````
julia> big(1/9)
0.111111111111111104943205418749130330979824066162109375

julia> big(1/9) |> nextfloat # notice all these "spare" digits
0.1111111111111111049432054187491303309798240661621093750000000000000000000000011
``````
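
To make the argument concrete, here is a small sketch that counts the digits of the exact expansion. It relies on the fact that a finite `Float64` is an integer divided by a power of two, say `p / 2^k` with `p` odd, which equals `p * 5^k / 10^k` and so has exactly `k` digits after the point. The helper name `exact_decimals` is made up for illustration.

```julia
using Printf

# Smallest k ≥ 0 such that x * 2^k is an integer. For a finite Float64 this is
# also the number of digits after the point in its exact decimal expansion.
function exact_decimals(x::Float64)
    isfinite(x) || throw(ArgumentError("x must be finite"))
    k = 0
    while !isinteger(x)
        x *= 2        # exact: multiplying by 2 only changes the exponent
        k += 1
    end
    return k
end

k = exact_decimals(1/9)
println(k)                                              # 54
println(Printf.format(Printf.Format("%.$(k)f"), 1/9))   # same digits as big(1/9)
```

For very small numbers `k` gets large (over a hundred digits for values around 1e-17), which is why the exact expansion is rarely what you want in a data file.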

However, far fewer digits than this are necessary to resolve the float uniquely. That number is roughly given by the `ndigits` function suggested above (maybe add 1 to be safe? I haven’t thought hard about it).

Yes, being smarter is harder

In general, efficiently determining the minimal number of decimal digits required to uniquely identify a particular binary floating point number (so that it parses back to the same value) is a hard problem on which many academic papers have been published (see Grisu and Ryū; Julia itself uses the latter).
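
As a small illustration of what "shortest" means here: `string` on a `Float64` prints the fewest digits that still parse back to the same value, while the exact stored value has many more. This sketch only shows the behavior; it says nothing about how the algorithm works internally.

```julia
x = 0.1 + 0.2
s = string(x)                     # shortest decimal string that parses back to x
println(s)                        # 0.30000000000000004
println(parse(Float64, s) == x)   # true
println(big(x))                   # the exact stored value, with many more digits
```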

Oh okay, I thought this would be a trivial problem. I’ll settle for your formula then. Thank you.

I’ve edited my answer to do slightly better for large numbers and more specifically target the behavior of the `%0.*f` format.

If you have any control over that C program, it would be much simpler to change its parsing to `sscanf` with a `%g` format instead.

I could clone the repo, edit, then make a custom jll, but it’s not worth the trouble. What you proposed worked just fine.

You could `reinterpret(UInt64, x)`, then cast back on the C side.
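
A minimal sketch of the Julia half of that idea, with the round trip done in Julia to show that no bits are lost. On the C side you would have to read the integer and copy its bits back into a `double` (for example with `memcpy`); that part is an assumption about your C program and is not shown.

```julia
x = 1.87589e-17
bits = reinterpret(UInt64, x)        # same 64 bits, viewed as an unsigned integer
s = string(bits)                     # plain decimal digits, never e-notation
y = reinterpret(Float64, parse(UInt64, s))
println(s)
println(y === x)                     # true: every bit survives the round trip
```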

You can also take the long and dirty route of manipulating strings.
Trigger warning - following isn’t pretty:

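``````
function deexponent(fp)
    s = string(fp)
    m = match(r"([^e]*)e([\-0-9]*)", s)
    isnothing(m) && return s                  # no exponent: already expanded
    if s[1] == '-'
        neg = true
        s = s[2:end]
    else
        neg = false
    end
    e = parse(Int, m.captures[2])             # decimal exponent
    m2 = match(r"([0-9]*)\.([0-9]*)", s)      # digits before and after the point
    # The lines that set `ad` and `alen` are missing from the post; the
    # assignments below are an assumed reconstruction.
    if isnothing(m2)
        bd = lstrip(m.captures[1], '-')
        ad = ""
    else
        bd, ad = m2.captures
    end
    blen = length(bd)
    alen = length(ad)
    c = blen + e                              # position of the point within bd * ad
    if c <= 0
        res = "0." * "0"^(-c) * bd * ad
    elseif c >= alen + blen
        res = bd * ad * "0"^(c - alen - blen) * ".0"
    else
        # This branch is also missing from the post; assumed: the point falls
        # strictly inside the digit string (hence <= and >= above).
        ds = bd * ad
        res = ds[1:c] * "." * ds[c+1:end]
    end
    neg ? "-" * res : res
end
``````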

With this patchy function:

``````
julia> deexponent(1.87589e-17)
"0.0000000000000000187589"

julia> 0.0000000000000000187589
1.87589e-17
``````

And other examples work as well.
This method might have string bugs, but doesn’t have log10 bugs.

I suspect that approach may have rounding issues: changing powers of 10 can change where the values round, since 10 isn’t a power of 2. So a decimal that is the shortest representation at one particular power of 10 isn’t necessarily going to round the same way at another.

You can also try the code below, which uses significant digits and is based on this other post.

``````
using Printf

function full_float_signif(x::Float64, sigdig::Int)
    (x == 0) && (return (1, "0"))
    x = round(x, sigdigits=sigdig)
    n = length(@sprintf("%d", abs(x)))              # length of the integer part
    if (x ≤ -1 || x ≥ 1)
        decimals = max(sigdig - n, 0)               # 'sig - n' decimals needed
    else
        Nzeros = ceil(Int, -log10(abs(x))) - 1      # No. zeros after decimal point before first number
        decimals = sigdig + Nzeros
    end
    s = @sprintf("%.*f", decimals, x)
    return length(s), s
end

# Example:
full_float_signif(1.87589e-17, 6)    # (24, "0.0000000000000000187589")
``````