I’m reading a large CSV file of Float64 values and then applying many different transformations before doing a least-squares fit.
I noticed that the columns are not plain Vector{Float64}, even though there are no missing values:
julia> using CSV
julia> d = CSV.read("/data/m4.csv", NamedTuple)
julia> typeof(d)
NamedTuple{(:hu, :hc, :hf, :hs, :bu, :bs, :cs), NTuple{7, SentinelArrays.ChainedVector{Float64, Vector{Float64}}}}
If I manually convert the columns to plain Vector{Float64} (by the way, is there an easier way to do this?), I get a 28% speedup in my transformation code:
julia> typeof((bs=d.bs[:], bu=d.bu[:], cs=d.cs[:], hc=d.hc[:], hf=d.hf[:], hs=d.hs[:], hu=d.hu[:]))
NamedTuple{(:bs, :bu, :cs, :hc, :hf, :hs, :hu), NTuple{7, Vector{Float64}}}
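For the manual conversion, one shorter form might be to map collect over the whole NamedTuple instead of listing every column. This is only a minimal sketch: d_demo is a hypothetical stand-in built directly with SentinelArrays, not the data from the file, and plain is an illustrative name.

```julia
using SentinelArrays  # provides ChainedVector, the column type shown above

# Hypothetical stand-in for the parsed file: two small chained columns.
d_demo = (a = ChainedVector([[1.0, 2.0], [3.0]]),
          b = ChainedVector([[4.0], [5.0, 6.0]]))

# map over a NamedTuple keeps the field names and their order;
# collect copies each column into a plain Vector.
plain = map(collect, d_demo)

typeof(plain)
```

Unlike the hand-written version, this keeps the original column order, and it costs one copy per column.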
Is there any way to tell CSV.read not to use SentinelArrays, so I don’t have to do the conversion myself?
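One option that may avoid the chained columns at read time is to turn off multithreaded parsing. The sketch below assumes a CSV.jl version that accepts the ntasks keyword (older releases used threaded=false instead), and it assumes the ChainedVector columns come from the file being parsed in chunks by several tasks. The in-memory CSV text and the name d1 are hypothetical stand-ins; the real call would pass the file path instead of the IOBuffer.

```julia
using CSV

# Hypothetical tiny CSV standing in for the real file.
csv_text = "hu,hc\n1.0,2.0\n3.5,4.5\n"

# Assumption: ntasks=1 asks CSV.jl to parse with a single task, so each
# column is built as one array rather than a chain of per-task chunks.
d1 = CSV.read(IOBuffer(csv_text), NamedTuple; ntasks=1)

typeof(d1.hu)
```

The trade-off is that a large file is then parsed on one thread, so reading may be slower even if the later transformations are faster.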
EDIT:
As a workaround, I’m using this:
d = JuliaDB.loadtable("/data/m4.csv").columns.columns