CSV dateformat

Hello,
I want to import data with a structure like this:

code,date
0,20190101
1,20190102

If I do

t = CSV.read("file.txt"; dateformat="yyyymmdd")
2×2 DataFrame
│ Row │ code  │ date     │
│     │ Int64 │ Int64    │
├─────┼───────┼──────────┤
│ 1   │ 0     │ 20190101 │
│ 2   │ 1     │ 20190102 │

it does not recognize the date structure and treats the date column as an Int.
However, for

code,date
0,2019-01-01
1,2019-01-02

it works when I do

t = CSV.read("file.txt"; dateformat="yyyy-mm-dd")
2×2 DataFrame
│ Row │ code  │ date       │
│     │ Int64 │ Date       │
├─────┼───────┼────────────┤
│ 1   │ 0     │ 2019-01-01 │
│ 2   │ 1     │ 2019-01-02 │

Similarly, it works with / as the separator.

I thought that CSV could recognize the dateformat yyyymmdd if specified as I do above. Is there a reason I am not seeing why it does not work?

I suppose that 20190101 etc. are recognized as Ints, so maybe you should try manually specifying the column type as Date. See the docs at

https://juliadata.github.io/CSV.jl/stable/#CSV.File


Thanks! It works when combining dateformat and types.

t = CSV.read("file.txt"; dateformat = "yyyymmdd", types = Dict(:date => Date) )
2×2 DataFrame
│ Row │ code  │ date       │
│     │ Int64 │ Date       │
├─────┼───────┼────────────┤
│ 1   │ 0     │ 2019-01-01 │
│ 2   │ 1     │ 2019-01-02 │
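
For anyone landing here later, here is a minimal sketch of what seems to be going on, using only the Dates standard library and not CSV itself. The variable names (raw, as_int, as_date, ints, dates) are made up for illustration. The idea, consistent with the suggestion above, is that a field like 20190101 has no separators and so is also a perfectly valid integer, while an explicit fixed-width format reads the same digits as a date.

```julia
using Dates

# One raw field from the file, as text (hypothetical variable name).
raw = "20190101"

# Without separators, the field is also a valid integer.
as_int = tryparse(Int, raw)

# With an explicit fixed-width format, the same digits parse as a Date.
fmt = dateformat"yyyymmdd"
as_date = Date(raw, fmt)

# A column that was already imported as Int can be converted the same way,
# by going through strings first.
ints = [20190101, 20190102]
dates = Date.(string.(ints), fmt)
```

The last two lines may be handy as a fallback if the column has already been read as Int64 and re-reading the file is not convenient.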
