[ANN] DataConvenience v0.1.2

A number of posts have asked for a CSV chunk reader, so the major new feature in DataConvenience v0.1.2 is a reasonably fast chunk reader based on CSV.jl.

See xiaodaigh/DataConvenience.jl on GitHub (convenience functions missing in Julia).

CSV Chunk Reader

You can read a CSV in chunks and apply logic to each chunk. The type of each column is inferred by CSV.read.

for chunk in CsvChunkIterator(filepath)
  # chunk is a DataFrame
  # do something to chunk
end
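
As an illustration of per-chunk logic, here is a minimal sketch that counts the rows of a file one chunk at a time. `total_rows` is a made-up helper name, not part of DataConvenience, and it only assumes that each chunk supports `size(chunk, 1)`, as a DataFrame does.

```julia
# Hypothetical helper (not part of DataConvenience): total row count across chunks.
# `chunks` can be any iterable whose elements support `size(x, 1)`,
# which includes DataFrames.
function total_rows(chunks)
    n = 0
    for chunk in chunks
        n += size(chunk, 1)  # number of rows in this chunk
    end
    return n
end

# Assumed usage with the chunk reader (requires `using DataConvenience`):
# total_rows(CsvChunkIterator(filepath))
```

The same pattern should work for any running aggregate (sums, counts per group, and so on): keep a small accumulator outside the loop and update it from each chunk.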

The chunk iterator uses the same parameters as CSV.read. For example, you can pass type or types to set the type of each column:

# read all columns as String
for chunk in CsvChunkIterator(filepath, type=String)
  # chunk is a DataFrame where each column is String
  # do something to chunk
end

# read a three-column CSV where the column types are String, Int, Float32
for chunk in CsvChunkIterator(filepath, types=[String, Int, Float32])
  # do something to chunk
end