Yes, it’d be helpful if you provided some more details here: what format is your data in? csv? feather? excel? something else? And why is processing more than 100 columns at a time a problem? I ask because on my 5-year-old laptop, I can process certain csv files with 20,000 columns without much trouble.
In the CSV.jl package, a recent addition is the CSV.Rows type, which allows efficient row-by-row iteration over the values in a csv file. It even accepts a reusebuffer=true keyword argument, which allocates a single buffer that gets re-used for each row while iterating, so memory use stays flat regardless of file size. So you could process an entire file by doing something like:
using CSV

for row in CSV.Rows(filename; reusebuffer=true)
    # do things with row values: row.col1, row.col2, etc.,
    # where `col1` is a column name in the csv file
end
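For example, here's a minimal sketch of computing the mean of one column this way without ever holding the whole file in memory. The file name "data.csv" and the column name x1 are placeholders, so substitute your own; it also assumes the column has no missing values.

using CSV

# Stream a large csv one row at a time and accumulate a running sum,
# so memory use stays flat no matter how many rows the file has.
function column_mean(filename, colname)
    total = 0.0
    nrows = 0
    for row in CSV.Rows(filename; reusebuffer=true)
        # Values come back as strings by default, so parse what you need.
        # (Assumes no missing values in this column.)
        total += parse(Float64, getproperty(row, colname))
        nrows += 1
    end
    return total / nrows
end

println(column_mean("data.csv", :x1))

Wrapping the loop in a function keeps the accumulators in local scope and lets Julia compile the hot loop, which matters if the file is large.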
Hope that helps?