If you have a single large file, there are two ways to go about what you want:
- Read the file into JuliaDB with `loadtable` in one go and call `Dagger.distribute` on that table. The only question is whether your memory can take it. If `loadtable` fails, consider TextParse.jl or CSV.jl, which can yield a DataFrame that you can turn into a JuliaDB table.
- Split the CSV file first. Write a short piece of code that splits your large file into many small files, then use `loadtable`'s distributed functionality on them.
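
For the second option, a minimal sketch of such a splitter could look like the following. It uses only Base Julia; `split_csv`, `rows_per_chunk`, the chunk file names, and `big.csv` are hypothetical names chosen for illustration. It assumes the first line is a header and that no quoted field contains a newline, so adjust it if your data breaks either assumption.

```julia
# Hypothetical helper: split the CSV at `path` into files of at most
# `rows_per_chunk` data rows each, repeating the header line in every chunk.
# Assumes a single header line and no newlines inside quoted fields.
function split_csv(path::AbstractString, outdir::AbstractString;
                   rows_per_chunk::Integer = 1_000_000)
    rows_per_chunk > 0 || throw(ArgumentError("rows_per_chunk must be positive"))
    mkpath(outdir)
    chunks = String[]
    open(path, "r") do input
        eof(input) && return              # empty input: nothing to write
        header = readline(input)
        out = nothing                     # current chunk file, opened lazily
        nrows = 0
        for line in eachline(input)       # streams line by line, low memory use
            if out === nothing || nrows == rows_per_chunk
                out === nothing || close(out)
                name = "chunk_$(lpad(length(chunks) + 1, 5, '0')).csv"
                push!(chunks, joinpath(outdir, name))
                out = open(chunks[end], "w")
                println(out, header)
                nrows = 0
            end
            println(out, line)
            nrows += 1
        end
        out === nothing || close(out)
    end
    return chunks
end

# Possible usage (assumes JuliaDB is installed and worker processes exist):
#   using Distributed; addprocs(4)
#   @everywhere using JuliaDB
#   files = split_csv("big.csv", "chunks")
#   t = loadtable(files)
```

Because the file is streamed one line at a time, the splitter itself should not run into the memory limit that the first option can hit. The chunk size is a tuning knob: a few chunks per worker is a reasonable starting point.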