Dates.DateTime format in a DataFrame

I am fetching content of the table and downloading it CSV (using LibPQ, DataFrames and CSV).
This is the code converting DataFrame to CSV:

executeResult = execute(conn, "SELECT * FROM table")
frame = DataFrame(executeResult)
join(CSV.RowWriter(frame))

And this is roughly the result:

date,gascons,irradiation,rain,temperature,windspeed,ph,_label
2004-01-01T00:00:00.0,53875.51,0.0,0.5,0.0,0.0,0.0,0.0
2004-01-01T01:00:00.0,51297.24,0.0,0.3,0.8,5.5,0.0,0.0
2004-01-01T02:00:00.0,51010.4,0.0,0.5,0.6,6.3,0.0,0.0

The problem is that β€œT” in the column with type Dates.DateTime. The resulting CSV should look like this:

date,gascons,irradiation,rain,temperature,windspeed,ph,_label
2004-01-01 00:00:00.0,53875.51,0.0,0.5,0.0,0.0,0.0,0.0
2004-01-01 01:00:00.0,51297.24,0.0,0.3,0.8,5.5,0.0,0.0
2004-01-01 02:00:00.0,51010.4,0.0,0.5,0.6,6.3,0.0,0.0

I achieve it easily, using replace(join(CSV.RowWriter(frame)), "T" => " ") instead of the last line in the code above, but this would work for me only because all other columns are numeric. I also think this is not an efficient solution esp. for large input table (huge string with many T’s to replace). Could I somehow ask DataFrame to supply Dates.DateTime column in a different format?

Thank you.

If you’re already materializing the query result as a DataFrame, you should just use CSV.write instead of CSV.RowWriter. The latter is for when you have a true row-streaming case where you don’t want to materialize the full intermediate table in memory (like creating a DataFrame).

Something like this should work:

transform(col, val) = val
transform(col, val::DateTime) = Dates.format(val, dateformat"yyyy-mm-dd HH:MM:SS.s")
CSV.write("output.csv", frame; transform=transform)

I cannot use CSV.write because I don’t want to write in CSV file, just to hand the data in the string to another service.

You still can use CSV.write since it accepts IOBuffer

julia> df = DataFrame("A" => [1, 2, 3], "B" => ["A", "B", "C"], "T" => DateTime.([0, 1, 2
]))
3Γ—3 DataFrame
 Row β”‚ A      B       T
     β”‚ Int64  String  DateTime
─────┼────────────────────────────────────
   1 β”‚     1  A       0000-01-01T00:00:00
   2 β”‚     2  B       0001-01-01T00:00:00
   3 β”‚     3  C       0002-01-01T00:00:00

julia> transform(col, val) = val
julia> transform(col, val::DateTime) = Dates.format(val, dateformat"yyyy-mm-dd HH:MM:SS.s")
julia> iob = IOBuffer()

julia> CSV.write(iob, df, transform = transform)
julia> String(take!(iob))
"A,B,T\n1,A,0000-01-01 00:00:00.0\n2,B,0001-01-01 00:00:00.0\n3,C,0002-01-01 00:00:00.0\n
"
3 Likes

It works like a charm. Thank you!