Significant digits in a CSV file

DaWa · April 13, 2022, 11:22am

Hello all,

I am reading in a lot of CSV files and creating a DataFrame from each one using

DataFrame(CSV.File("C:/Users/.../$id.csv", header=["date", "tmax"]))

Now I would like to add the number of significant digits of tmax as a new column of my df. Since the tmax values are Floats, ndigits does not work. Is there an easy way to do this?

Thanks, Daniel
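
For readers following along, here is a minimal sketch, using only Base Julia, of why the parsed Float64 values cannot answer this question directly. The variable names are illustrative only. It shows that ndigits has no method for floating-point arguments, and that the way a number was written in the file is lost once it has been parsed.

```julia
# ndigits counts the digits of an integer
n_int = ndigits(5700)                     # 4

# ...but it has no method for floating-point arguments
float_ok = applicable(ndigits, 31.887)    # false

# Parsing discards how the number was written in the file:
# both strings produce the same Float64 value
same_value = parse(Float64, "31.88700") == parse(Float64, "31.887")   # true
```

This suggests that any count of written digits has to be taken from the text of the file rather than from the parsed values.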

jd-foster · April 13, 2022, 12:33pm

Try length(digits(x)) where x is one or both of the two parts obtained from Base.Math.modf
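
As a rough illustration of this suggestion (a sketch under my own reading of it, not necessarily how it was meant to be used): modf splits a float into its fractional and integer parts, but both parts come back as floats, so digits cannot be applied to them without a conversion.

```julia
# modf returns (fractional part, integer part), both as Float64
fpart, ipart = modf(1.1)

ipart_is_float = ipart isa Float64        # true

# digits is only defined for integers
digits_ok = applicable(digits, ipart)     # false

# Converting the integer part first does work
int_digits = length(digits(Int(ipart)))   # 1

# The fractional part carries binary rounding error, so its decimal
# digits are not the ones that were written in the file
fpart_exact = (fpart == 0.1)              # false
```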

rafael.guerra · April 13, 2022, 12:43pm

FWIW, in case this helps.
The numbers are loaded as strings for processing the digits and parsed to floats later:

using CSV, DataFrames

const input="""
date,tmax
1-Jan-2000,5.7E3
2-Feb-2001,31.88700
3-Mar-2002,100e0
4-Apr-2003,100.1
"""

df = DataFrame(CSV.File(IOBuffer(input), types=Dict(:date=>String, :tmax=>String)))

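# Drop the exponent (split on 'e'), then split the mantissa at the decimal point
str = split.(string.(first.(split.(lowercase.(df.tmax),'e'))),'.')
df.sdigs = zeros(Int, nrow(df))
for (i,s) in pairs(str)
    df.sdigs[i] = length(s[1])    # digits before the decimal point
    # add the digits after the decimal point, with trailing zeros stripped
    (length(s)>1) && (df.sdigs[i] += length(rstrip(s[2],'0')))
end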
df.tmax = parse.(Float64, df.tmax)
df

# Result:
 Row │ date        tmax      sdigs 
     │ String      Float64   Int64
─────┼─────────────────────────────
   1 │ 1-Jan-2000  5700.0        2
   2 │ 2-Feb-2001    31.887      5
   3 │ 3-Mar-2002   100.0        3
   4 │ 4-Apr-2003   100.1        4

DaWa · April 13, 2022, 2:03pm

Thanks to both of you for the replies. I think jd-foster's answer does not work, since Base.Math.modf still returns floats. There would also be issues with trailing zeros and with numbers like 1.1.

rafael.guerra, I really like your answer. I am not sure about this line, though:
(length(s)>1) && (df.sdigs[i] += length(rstrip(s[2],'0')))

since that removes significant 0’s at the end, doesn’t it?

rafael.guerra · April 13, 2022, 2:16pm

Daniel (@DaWa), yes, that line is there to remove trailing zeros from the fractional part of numbers like 31.88700, but maybe those should be kept?

Frankly, the code was just to give some ideas, as the general problem is beyond me.

DaWa · April 13, 2022, 3:32pm

All right, thanks a lot! So yeah, those 0’s are significant and should be kept. Your code works great and really helped me. It is also very nice that it already includes a solution for scientific notation!
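
For anyone landing here later, a minimal sketch of the variant agreed on above, in which trailing zeros of the fractional part are kept. The helper name count_sigdigits is hypothetical and not from any package. As an extra assumption that was not discussed in the thread, it also strips leading zeros (so "0.0057" counts as 2 digits) and ignores a sign.

```julia
# Hypothetical helper: count the significant digits of a number from its text.
# Assumptions: trailing zeros are significant, leading zeros are not,
# and the exponent and sign do not contribute any digits.
function count_sigdigits(s::AbstractString)
    # Keep only the mantissa, i.e. the part before an 'e' or 'E' exponent
    mantissa = first(split(lowercase(strip(s)), 'e'))
    # Drop the sign and the decimal point, keeping digit characters only
    digits_only = filter(isdigit, mantissa)
    # Leading zeros (as in "0.0057") are placeholders, not significant digits
    return length(lstrip(digits_only, '0'))
end

counts = count_sigdigits.(["5.7E3", "31.88700", "100e0", "100.1"])
```

With the string column from the example above, this could be applied as df.sdigs = count_sigdigits.(df.tmax) before df.tmax is parsed to Float64. Note that an input consisting only of zeros, such as "0.0", yields 0 under these assumptions, and that trailing zeros of an integer such as "100" are counted as significant, as in the original code.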
