Arrow stream usage clarification

bkamins · October 30, 2022, 8:43am

Arrow.Stream reads data in record batches. Therefore you need to write data in record batches to a file first. Here is a simple example of splitting your data into two batches:

using Arrow, Tables

# each view becomes one partition, which Arrow.write stores as one record batch
p = Tables.partitioner([view(df, 1:2000000, :), view(df, 2000001:4000000, :)])
Arrow.write(path, p)
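
To see the reading side as well, here is a minimal sketch of the round trip. The small `df`, the temporary `path`, and the `batch_rows` vector are stand-ins made up for illustration, not taken from the original question; the only assumption is that the source is a DataFrame (any Tables.jl-compatible table should behave similarly).

```julia
using Arrow, DataFrames, Tables

# Hypothetical small data frame standing in for the real 4,000,000-row one
df = DataFrame(id = 1:10, x = Float64.(1:10))

# Two partitions, so the file ends up with two record batches
parts = Tables.partitioner([view(df, 1:5, :), view(df, 6:10, :)])
path = tempname() * ".arrow"
Arrow.write(path, parts)

# Arrow.Stream iterates over the file one record batch at a time;
# each iteration yields an Arrow.Table holding only that batch
batch_rows = Int[]
for batch in Arrow.Stream(path)
    push!(batch_rows, length(Tables.getcolumn(batch, :id)))
end
```

After the loop, `batch_rows` should hold one row count per batch. If the file had been written from `df` directly, without a partitioner, the stream would be expected to yield a single batch containing all rows.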
