Request for feedback on potential CSV.jl feature

Mainly because this has not been mentioned here yet: I was wondering how much of what is proposed is already possible, with a reasonably simple API, by combining CSV.jl with FileTrees.jl (shashi/FileTrees.jl: Parallel file processing made easy, on GitHub). See also the package announcement.

This probably does not address the parsing / promotion issues, but I thought it could be a relevant reference (it should definitely handle reading multiple files concurrently).

Another useful reference for the API could be JuliaDB.loadtable, which also allows ingesting multiple files at once. It also supports adding a separate column that is populated with the name of each file (or a function thereof). This can be quite useful when relevant information is encoded in the file name and not in the CSV itself.
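To make the file-name column idea concrete, here is a minimal sketch using only CSV.jl and DataFrames.jl. It is not how JuliaDB.loadtable or FileTrees.jl implement it. The helper `read_with_source` and its `source_col` and `transform` keywords are hypothetical names made up for illustration, and the files are read sequentially, not concurrently.

```julia
using CSV, DataFrames

# Hypothetical helper (not part of CSV.jl): read every file in `paths` and
# add a column holding `transform(path)` for the file each row came from.
function read_with_source(paths; source_col = :source, transform = basename)
    tables = map(paths) do p
        df = CSV.read(p, DataFrame)
        df[!, source_col] .= transform(p)   # same value for every row of this file
        df
    end
    # cols = :union keeps columns that appear in only some files,
    # filling the gaps with `missing`.
    return reduce(vcat, tables; cols = :union)
end

# Small demo with two throwaway files.
dir = mktempdir()
write(joinpath(dir, "2020.csv"), "a,b\n1,2\n")
write(joinpath(dir, "2021.csv"), "a,b\n3,4\n5,6\n")
files = sort(filter(endswith(".csv"), readdir(dir; join = true)))

df = read_with_source(files)
# Example of "a function thereof": keep only the year encoded in the file name.
df_year = read_with_source(files; source_col = :year,
                           transform = p -> parse(Int, first(splitext(basename(p)))))
```

Passing a function for `transform` covers the case where only part of the file name (here, the year) is the relevant information.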
