I have a huge DataFrame with about 26M rows. How can I write it to a CSV file faster and with less memory?
julia> pdpdpd=vcat(pd,pd,pd)
26590302×6 DataFrame
Row │ chr position pdr discordant sum allsum
│ String Int64 Float64 Int64 Int64 Int64
──────────┼───────────────────────────────────────────────────────
1 │ chr1 10542 0.0 0 1 1
2 │ chr1 10563 0.0 0 1 1
3 │ chr1 10571 0.0 0 1 1
4 │ chr1 10577 0.0 0 1 1
5 │ chr1 10579 0.0 0 1 1
6 │ chr1 10589 0.0 0 1 1
7 │ chr1 10609 1.0 1 1 1
8 │ chr1 10617 1.0 1 1 1
9 │ chr1 10620 1.0 1 1 1
10 │ chr1 10633 1.0 1 1 1
11 │ chr1 10636 1.0 1 1 1
12 │ chr1 10638 1.0 1 1 1
13 │ chr1 10641 1.0 1 1 1
14 │ chr1 10644 1.0 1 1 1
15 │ chr1 10650 1.0 1 1 1
16 │ chr1 10660 1.0 1 1 1
17 │ chr1 10662 1.0 1 1 1
18 │ chr1 10665 1.0 1 1 1
19 │ chr1 10667 1.0 1 1 1
20 │ chr1 10670 1.0 1 1 1
21 │ chr1 13303 NaN 0 0 1
22 │ chr1 13668 NaN 0 0 1
23 │ chr1 13694 NaN 0 0 1
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
26590281 │ chr1 248944853 1.0 1 1 1
26590282 │ chr1 248944864 1.0 1 1 1
26590283 │ chr1 248944868 1.0 1 1 1
26590284 │ chr1 248944875 1.0 1 1 1
26590285 │ chr1 248944889 1.0 1 1 1
26590286 │ chr1 248944897 1.0 1 1 1
26590287 │ chr1 248944900 1.0 1 1 1
julia> @time CSV.write("a.csv",pdpdpd)
40.550157 seconds (718.14 M allocations: 17.845 GiB, 11.81% gc time, 0.28% compilation time)
"a.csv"
The write took about 40 seconds and allocated 17.8 GiB of memory. What should I do? Thanks.