Hello again. I’m testing reading a large CSV file with CSVFiles.jl instead of CSV.jl, because the file is too large for CSV.jl on Windows. But I cannot use the CSVFile as an iterable. I’m trying:
for i in iter
function test(src::String)
    table = load(File(format"CSV", src);
        colnames=[:day, :glnprovider, :glnretailerlocation, :gtin, :inventory, :cost, :sales, :price],
        colparsers=[Date, UInt64, UInt64, UInt64, Float32, Float32, Float32, Float32])
    return table |> mylength
end
I’m hoping to use the table as an iterable to calculate different things in a streaming fashion. But I’m getting the error:
MethodError: no method matching iterate(::CSVFiles.CSVFile)
So I found the developer guide for iterable tables. In that document, the author says that we can call the method getiterator. But when I tried that, the error is:
UndefVarError: getiterator not defined
So, how can I use an iterable table (the CSV file) as an iterator? Do I need to get the iterator somehow?
getiterator is defined in IteratorInterfaceExtensions.jl, so you need to load that package.
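A minimal sketch of what that could look like, assuming CSVFiles.jl and IteratorInterfaceExtensions.jl are both installed. The helper name count_rows is hypothetical and not part of either package:

```julia
using CSVFiles                     # provides load for CSV files
using IteratorInterfaceExtensions  # provides getiterator

# Hypothetical helper: count the rows of a CSV file by iterating over it.
# The path should end in .csv so that the file format is detected.
function count_rows(path::AbstractString)
    table = load(path)         # a CSVFiles.CSVFile, not iterable with a plain `for`
    rows = getiterator(table)  # an iterator over the rows of the table
    n = 0
    for row in rows
        n += 1
    end
    return n
end
```

Calling count_rows("foo.csv") should then loop over the rows without hitting the MethodError, since the loop runs over the result of getiterator rather than over the CSVFile itself.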
But be warned: CSVFiles.jl currently reads everything into memory and then iterates from that. So if load("foo.csv") |> DataFrame doesn’t work because of memory limitations, then using getiterator will probably not work either (still worth a try, of course).
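If memory really is the limit, one fallback that does not involve CSVFiles.jl at all is to read the file line by line with eachline from Base. The sketch below is only that: stream_sum is a hypothetical helper, and it assumes a single header line, comma delimiters, and no quoted fields that contain commas.

```julia
# Hypothetical helper: count the data rows and sum one numeric column
# while holding only a single line of the file in memory at a time.
# Assumes one header line and no quoted fields containing commas.
function stream_sum(path::AbstractString, col::Int)
    nrows = 0
    total = 0.0
    for (i, line) in enumerate(eachline(path))
        i == 1 && continue         # skip the header line
        isempty(line) && continue  # ignore blank lines
        fields = split(line, ',')
        total += parse(Float64, fields[col])
        nrows += 1
    end
    return nrows, total
end
```

For example, stream_sum("foo.csv", 5) would return the number of data rows together with the sum of the fifth column. This naive split is not a real CSV parser, so it is only suitable for simple, well-formed files.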
If you don’t need all of the columns of the file, you can try the new skip column feature that I’ve added to TextParse#master. Make sure you are using that (pkg> add TextParse#master), and then something like load("foo.csv", colparsers=Dict(:colA=>nothing, :colC=>nothing)) |> DataFrame should work. In that case, colA and colC are not loaded into memory at all.
My next project is to integrate the skip column feature with Query.jl, so that something like load("foo.csv") |> @select(-:colA, -:colC) |> DataFrame automatically skips those columns during load. No promise on timing, though.
EDIT: Oh, and I also plan to add a fully streaming mode at some point. Almost all the pieces for that exist already, so it actually shouldn’t be too difficult, but again, no timeline right now.
Thanks for your time. I’ll check just in case.
Great, any feedback on whether it works would be most welcome