Significant digits in a CSV file

Hello all,

I am reading in a lot of CSV files and creating a DataFrame from each one using

DataFrame(CSV.File("C:/Users/.../$id.csv", header=["date", "tmax"]))

Now I would like to add the number of significant digits of tmax as a new column of my df. Since the tmax values are Floats, ndigits does not work. Is there an easy way to do this?

Thanks, Daniel

Try length(digits(x)) where x is one or both of the two parts obtained from Base.Math.modf


FWIW, in case this helps.
The numbers are loaded as strings for processing the digits and parsed to floats later:

using CSV, DataFrames

const input="""
date,tmax
1-Jan-2000,5.7E3
2-Feb-2001,31.88700
3-Mar-2002,100e0
4-Apr-2003,100.1
"""

df = DataFrame(CSV.File(IOBuffer(input), types=Dict(:date=>String, :tmax=>String)))

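# Drop the exponent (the text after 'e' or 'E'), then split the mantissa at the decimal point
str = split.(string.(first.(split.(lowercase.(df.tmax),'e'))),'.')
df.sdigs = zeros(Int, nrow(df))
for (i,s) in pairs(str)
    # Digits before the decimal point
    df.sdigs[i] = length(s[1])
    # Digits after the decimal point, with trailing zeros stripped
    (length(s)>1) && (df.sdigs[i] += length(rstrip(s[2],'0')))
end
# Convert the text column to numbers now that the digits are counted
df.tmax = parse.(Float64, df.tmax)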
df

# Result:
 Row │ date        tmax      sdigs
     │ String      Float64   Int64
─────┼─────────────────────────────
   1 │ 1-Jan-2000  5700.0        2
   2 │ 2-Feb-2001    31.887      5
   3 │ 3-Mar-2002   100.0        3
   4 │ 4-Apr-2003   100.1        4

Thanks to both of you for the replies. I think jd-foster's answer does not work, since Base.Math.modf still returns floats and digits only accepts integers. There would also be issues with trailing zeros and with numbers like 1.1.
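To illustrate those objections, here is a minimal sketch (not from the original posts; the variable names are made up for the illustration). It only checks how modf, digits and parse behave on values like the ones discussed:

```julia
# modf splits a float into (fractional part, integer part); both are still floats.
fpart, ipart = modf(1.1)
parts_are_floats = fpart isa Float64 && ipart isa Float64

# Base.digits is defined for integers, so it has no method for a Float64 argument.
digits_accepts_float = applicable(digits, fpart)

# 1.1 has no exact binary representation, so the fractional part is not exactly 0.1.
frac_is_exact = fpart == 0.1

# A trailing zero written in the file ("1.10") is gone once the text is parsed.
trailing_zero_lost = parse(Float64, "1.10") == parse(Float64, "1.1")
```

This is why counting has to happen on the text as it appears in the file, before it is converted to Float64.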

rafael.guerra, I really like your answer. I am not sure about
(length(s)>1) && (df.sdigs[i] += length(rstrip(s[2],'0')))

since that removes significant 0's at the end, doesn't it?

Daniel (@DaWA), yes that line is there to remove trailing zeros from the fractional part of numbers like 31.88700, but maybe those should be kept?

Frankly, the code was just to give some ideas, as the general problem is beyond me.

All right, thanks a lot! So yes, those 0's are significant and should be kept. Your code works great and really helped me. It is also very nice that it already handles scientific notation!
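For later readers, a minimal sketch of the variant settled on above, in which trailing zeros of the fractional part are kept. The helper name sigdigits is hypothetical, and two things go beyond what the thread discussed and are assumptions on my part: a leading sign is ignored, and leading zeros (as in 0.0050) are not counted. As in the code above, an integer such as 100 is counted as 3 digits.

```julia
# Hypothetical helper: count the significant digits of a number given as text.
# Assumes well-formed input such as "31.88700", "5.7E3" or "-0.0050".
function sigdigits(s::AbstractString)
    # Keep only the mantissa: drop whitespace and anything after 'e' or 'E'.
    mantissa = first(split(lowercase(strip(s)), 'e'))
    # Assumption: a leading sign does not count as a digit.
    mantissa = lstrip(mantissa, ['+', '-'])
    parts = split(mantissa, '.')
    fracpart = length(parts) > 1 ? parts[2] : ""
    # Join integer and fractional digits; trailing zeros are kept on purpose.
    alldigits = parts[1] * fracpart
    # Assumption: leading zeros are never significant.
    return length(lstrip(alldigits, '0'))
end
```

With tmax still loaded as a String column, this could be applied as df.sdigs = sigdigits.(df.tmax) before parsing the column to Float64. Note that a value written as 0 or 0.0 yields 0 under these assumptions, which may or may not be what you want.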