I can’t share the CSV file because it contains private details. However, it is a simple file: 3 columns and 14,000 rows.
I tried again this morning with:
print("using : ")
@time begin
using DataFrames
using CSV
using Dates
using ZipArchives
using GeoStats
using GeoIO
end
print("Proj : ")
@time import Proj
print("Makie : ")
@time import CairoMakie as Mke
print("CSV : ")
@time postcodes = CSV.read("TerminatedPostcodes.csv", DataFrame)
This produced the following timings:
using : 189.748529 seconds (4.61 M allocations: 283.892 MiB, 0.17% gc time, 0.11% compilation time: 75% of which was recompilation)
Proj : 1.068109 seconds (22.04 k allocations: 1.244 MiB)
Makie : 78.394334 seconds (2.57 M allocations: 244.475 MiB, 0.35% gc time, 0.01% compilation time)
CSV : 3.507138 seconds (2.60 M allocations: 175.795 MiB, 0.77% gc time, 311.58% compilation time: 9% of which was recompilation)
OK, so not 7 minutes today, but still nearly 5 minutes before it even starts to read my file.
As I say, this is an issue only with the first run of the day. Even if I close VSCode and reopen again, a second run gives the following timing:
using : 10.471567 seconds (4.61 M allocations: 283.954 MiB, 2.99% gc time, 2.06% compilation time: 81% of which was recompilation)
Proj : 0.059018 seconds (22.04 k allocations: 1.244 MiB)
Makie : 5.758698 seconds (2.57 M allocations: 244.475 MiB, 4.43% gc time, 0.13% compilation time)
CSV : 3.401152 seconds (2.60 M allocations: 175.793 MiB, 1.14% gc time, 308.47% compilation time: 9% of which was recompilation)
and within the same VSCode session, it’s slightly faster still:
using : 5.474012 seconds (4.61 M allocations: 283.907 MiB, 5.90% gc time, 3.72% compilation time: 79% of which was recompilation)
Proj : 0.033830 seconds (22.04 k allocations: 1.244 MiB)
Makie : 3.343034 seconds (2.57 M allocations: 244.475 MiB, 8.44% gc time, 0.23% compilation time)
CSV : 3.196636 seconds (2.61 M allocations: 175.897 MiB, 5.95% gc time, 307.54% compilation time: 9% of which was recompilation)
I don’t have any problem with the overhead in these last two examples. It is just the first run of the day that is uniquely painful.
I might try again tomorrow to break down the using timings a bit more…
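
A minimal sketch of how that breakdown could look, timing each package on its own instead of all six inside one @time block. The helper name `time_each_using` is hypothetical, not part of any package. Shared dependencies are charged to whichever package loads them first, so the per-package numbers depend on the order. On Julia 1.8 or later, `@time_imports` from InteractiveUtils should give a finer per-dependency breakdown.

```julia
# Time each `using` separately instead of one combined @time block.
# `time_each_using` is a hypothetical helper name, not part of any package.
function time_each_using(pkgs)
    timings = Pair{Symbol,Float64}[]
    for pkg in pkgs
        # Evaluate `using <pkg>` at the top level of Main and measure wall-clock time.
        t = @elapsed Core.eval(Main, :(using $pkg))
        push!(timings, pkg => t)
        println(rpad(string(pkg), 12), " : ", round(t; digits = 2), " s")
    end
    return timings
end

# Same packages, in the same order as the combined `using` block in this post.
time_each_using([:DataFrames, :CSV, :Dates, :ZipArchives, :GeoStats, :GeoIO])
```

This only gives a meaningful answer on the first run of the day, in a fresh Julia session, since that is the only case where the slow load shows up.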