I must work with a dataset of floating-point numerical entries. A few of them (roughly 0.1%) are very small and are therefore written in e-notation (e.g. 1.87589e-17). I’m passing this dataset to a C program via Julia, and apparently it does not recognize 1.87589e-17 as a numerical value. Therefore I’d like to convert these numbers to their fully expanded format, e.g. 0.0000000000000000187589.
To do this I can use @printf "%.*f" 22 1.87589e-17. The problem is that 22 is not always the right length for the expansion. So I would like to know whether there is a formula that returns the number of digits after the decimal point (22 here) for a given number.
I know I could pick a large length and apply it to all numbers, but this would greatly increase the size of the data files.
There is no floating point number equal to 1//9. Every finite IEEE 754 floating point number has a finite decimal representation, because every integer power of 2 (positive or negative) has a finite decimal representation.
julia> big(1/9)
0.111111111111111104943205418749130330979824066162109375
julia> big(1/9) |> nextfloat # notice all these "spare" digits
0.1111111111111111049432054187491303309798240661621093750000000000000000000000011
Far fewer digits than this are necessary to resolve the float uniquely, though. That number is roughly given by the ndigits function suggested above (maybe add 1 to be safe? I haven’t thought hard about it).
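To make the finite-expansion claim concrete, here is a minimal sketch that prints the exact decimal value of a Float64 using only exact rational arithmetic. The name exact_decimal is hypothetical (not a Base function); the idea is that a float equals n / 2^k, which is the same as n * 5^k / 10^k, so exactly k digits follow the decimal point.

```julia
# Hypothetical helper (not part of Base): exact decimal expansion of a finite Float64.
function exact_decimal(x::Float64)
    isfinite(x) || throw(ArgumentError("x must be finite"))
    r = Rational{BigInt}(x)                 # exact value as numerator / 2^k
    k = trailing_zeros(denominator(r))      # the denominator is exactly 2^k
    k == 0 && return string(numerator(r))   # x is an integer
    # n / 2^k == n * 5^k / 10^k, so exactly k digits follow the decimal point.
    n = abs(numerator(r)) * big(5)^k
    ds = lpad(string(n), k + 1, '0')        # guarantee one digit before the point
    sgn = x < 0 ? "-" : ""
    return sgn * ds[1:end-k] * "." * ds[end-k+1:end]
end

println(exact_decimal(0.5))   # prints 0.5
println(exact_decimal(0.1))
```

The output is usually far longer than what is needed to identify the float uniquely, so this is mainly useful for inspection rather than for writing compact data files.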
In general, the process of efficiently determining the minimal number of decimal digits required to exactly represent a particular binary floating point number is a hard problem on which many academic papers have been published (see grisu, ryū; Julia itself uses the latter).
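Without implementing Grisu or Ryū, a brute-force sketch can lean on Printf and a round-trip test: try 1 to 17 significant digits, stop at the first that parses back to the same float, and convert that to a number of decimals. The names min_decimals and expand are hypothetical, and the loop bound assumes the usual fact that 17 significant digits always round-trip a Float64.

```julia
using Printf

# Hypothetical helper: smallest number of decimals d such that the
# fixed-point string with d decimals parses back to exactly x.
function min_decimals(x::Float64)
    isfinite(x) || throw(ArgumentError("x must be finite"))
    x == 0 && return 0
    for p in 0:16                                   # p + 1 significant digits
        s = Printf.format(Printf.Format("%.$(p)e"), x)
        if parse(Float64, s) == x
            e = parse(Int, last(split(s, 'e')))     # decimal exponent of s
            return max(p - e, 0)
        end
    end
    error("unreachable: 17 significant digits always round-trip a Float64")
end

# Fixed-point string with just enough decimals to round-trip.
expand(x::Float64) = Printf.format(Printf.Format("%.$(min_decimals(x))f"), x)

println(min_decimals(1.87589e-17))   # prints 22
println(expand(1.87589e-17))         # prints 0.0000000000000000187589
```

This is slower than a dedicated shortest-digits algorithm because it formats and parses up to 17 times per value, but for a one-off export of a data file that cost may be acceptable.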
You can also take the long and dirty route of manipulating strings.
Trigger warning - following isn’t pretty:
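function deexponent(fp)
    s = string(fp)
    # Split into mantissa and exponent; no exponent means nothing to expand.
    m = match(r"^([^e]*)e([\-0-9]*)$", s)
    isnothing(m) && return s
    mant = m.captures[1]
    # Remember the sign and strip it from the mantissa.
    neg = startswith(mant, '-')
    neg && (mant = mant[2:end])
    e = parse(Int, m.captures[2])
    # Digits before (bd) and after (ad) the decimal point; note the escaped dot.
    m2 = match(r"^([0-9]*)\.([0-9]*)$", mant)
    if isnothing(m2)
        bd, ad = mant, ""
    else
        bd, ad = m2.captures[1], m2.captures[2]
    end
    blen = length(bd)
    alen = length(ad)
    c = blen + e  # number of digits that end up before the decimal point
    if c <= 0
        res = "0." * "0"^(-c) * bd * ad
    elseif c >= alen + blen
        res = bd * ad * "0"^(c - alen - blen) * ".0"
    else
        res = (bd*ad)[1:c] * "." * (bd*ad)[c+1:end]
    end
    neg ? "-" * res : res
end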
I would suspect that may have rounding issues: changing powers of 10 can change where the values round, since 10 isn’t a power of 2. So a decimal that is the shortest representation at one particular power of 10 isn’t necessarily going to round the same way at another.
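One way to probe this concern for a particular value is to parse both spellings back and compare. Moving the decimal point inside the string leaves the denoted decimal number unchanged, so a correctly rounding parser should return the same float either way. The sketch below only checks the example value from the question, with the expanded string written out by hand, and it assumes Julia's parse(Float64, ...) as the parser; a C reader may behave differently.

```julia
x = 1.87589e-17
sci = string(x)                        # e-notation as Julia prints it
plain = "0.0000000000000000187589"     # same digits, decimal point shifted by hand

# Both strings denote the same decimal number, so both should parse to x.
same = parse(Float64, sci) == parse(Float64, plain) == x
println(sci, " vs ", plain, " -> ", same)
```

A check like this says nothing about strings whose digits were re-rounded along the way, which is where differences could plausibly appear.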
You could also try the code below, which works with significant digits and is based on this other post.
using Printf
function full_float_signif(x::Float64, sigdig::Int)
    (x == 0) && (return (1, "0"))
    x = round(x, sigdigits=sigdig)
    if (x ≤ -1 || x ≥ 1)
        # Digits of the truncated integer part (formatting a non-integer with "%d" is not reliable here).
        n = ndigits(trunc(BigInt, abs(x)))
        decimals = max(sigdig - n, 0)  # 'sigdig - n' decimals needed
    else
        Nzeros = ceil(Int, -log10(abs(x))) - 1  # No. of zeros after the decimal point before the first nonzero digit
        decimals = sigdig + Nzeros
    end
    s = @sprintf("%.*f", decimals, x)
    return length(s), s
end
# Example:
full_float_signif(1.87589e-17, 6)  # (24, "0.0000000000000000187589")