Arrow stream usage clarification

Arrow.Stream reads a file one record batch at a time, so the file must first be written as multiple record batches. Here is a simple example that splits your data into two batches:

using Arrow, Tables

# Each view becomes one record batch in the file
p = Tables.partitioner([view(df, 1:2000000, :), view(df, 2000001:4000000, :)])
Arrow.write(path, p)
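To check that the batches come back one at a time, here is a minimal round-trip sketch. The small `df`, its column names `id` and `value`, the temporary `path`, and the `batch_sizes` vector are all made up for illustration; swap in your own data and path.

```julia
using Arrow, DataFrames, Tables

# Hypothetical stand-in for the real 4,000,000-row df
df = DataFrame(id = 1:10, value = rand(10))

# Write two record batches: rows 1-5 and rows 6-10
path = tempname() * ".arrow"
p = Tables.partitioner([view(df, 1:5, :), view(df, 6:10, :)])
Arrow.write(path, p)

# Arrow.Stream yields one Arrow.Table per record batch
batch_sizes = Int[]
for batch in Arrow.Stream(path)
    push!(batch_sizes, length(batch.id))
end
```

Each `batch` in the loop is a table holding only that record batch, so you can process it and move on without holding the whole file in memory at once.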