If you have a single large file there are two ways to go about what you want:
- Read the file into JuliaDB with loadtable in one go and call Dagger.distribute on that table. The only question is whether your memory can take it. If loadtable fails, maybe consider TextParse.jl or CSV.jl, which will yield a DataFrame that you can turn into a JuliaDB table.
- Split the CSV file first. Write a short piece of code that splits your large file into many small files and use loadtable's distributed functionality.
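
For the second option, the splitting step could look roughly like the sketch below. It uses only Base Julia. The function name split_csv, the rows_per_file keyword and the part_00001.csv naming scheme are my own choices for illustration, not anything JuliaDB provides. It assumes the first line is a header and that no quoted field contains a line break, since it splits purely on lines.

```julia
# Split a large CSV into files of at most `rows_per_file` data rows each.
# The header line is repeated at the top of every output file.
# Assumes one record per line (no newlines inside quoted fields).
function split_csv(path::AbstractString, outdir::AbstractString;
                   rows_per_file::Integer = 1_000_000)
    mkpath(outdir)
    parts = String[]                      # paths of the files written so far
    open(path, "r") do io
        eof(io) && return                 # empty input: nothing to write
        header = readline(io)
        out = nothing                     # handle of the chunk being written
        nrows = 0                         # data rows in the current chunk
        for line in eachline(io)
            if out === nothing || nrows == rows_per_file
                out === nothing || close(out)
                name = "part_$(lpad(length(parts) + 1, 5, '0')).csv"
                push!(parts, joinpath(outdir, name))
                out = open(parts[end], "w")
                println(out, header)
                nrows = 0
            end
            println(out, line)
            nrows += 1
        end
        out === nothing || close(out)
    end
    return parts
end

# Hypothetical usage ("big.csv" and "chunks" are placeholder names):
# using Distributed; addprocs(4); @everywhere using JuliaDB
# parts = split_csv("big.csv", "chunks")
# t = loadtable(parts)   # loadtable also accepts a directory name
```

Because the file is streamed line by line, memory use stays small regardless of the input size. Pick rows_per_file so that you end up with at least as many files as you have workers.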