I added something like that to CSVFiles.jl recently:
load("foo.csv", colparsers=Dict(:colA=>nothing, :colC=>nothing)) |> DataFrame
Essentially, when you assign nothing as the colparser for a given column, that column is skipped entirely.
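For anyone who wants to try this out, here is a minimal sketch. The temporary file and its three columns are made up for illustration, and it assumes CSVFiles.jl and DataFrames.jl are installed and that your CSVFiles.jl version is recent enough to include the skipping behaviour described above.

```julia
using CSVFiles, DataFrames

# Hypothetical sample data: a small three-column file written to a temp dir.
path = joinpath(mktempdir(), "foo.csv")
write(path, "colA,colB,colC\n1,2.5,x\n2,3.5,y\n")

# Assign `nothing` as the parser for colA and colC so they are skipped
# while the file is parsed; only colB should end up in the DataFrame.
df = load(path, colparsers=Dict(:colA=>nothing, :colC=>nothing)) |> DataFrame

# Inspect which columns were actually loaded.
println(propertynames(df))
```

Note that this is a negative selection: you list the columns you do not want, which gets tedious for wide files.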
What I don’t have yet is a nice (positive) column selection API. My goal is to make
load("foo.csv") |> @select(:colA, :colB)
work with this. Even though it looks as if you are selecting columns after they have been read, the design of Query.jl and CSVFiles.jl is such that I can arrange for it to never actually read any column other than colA and colB. The goal is to support the full column selection story from the @select command.