Write data to Arrow file row by row

I am loading a large amount of data from an API, and I would like to write the output of each call to a “row” of a file, so that I don't have to keep everything in RAM and don't lose it if the program errors. I have been using the Arrow.jl package for my data reading/writing needs recently and really like it. However, I have not found a way to append rows to an Arrow file. Is that possible?

Thanks!

Here is my current example that does not work:

using Arrow, Tables

dat = [(a=1, b=2), (a=3,b=4)]
open("test.feather", "a+") do io
    for row in dat
        Arrow.write(io, [row])
    end
end

Arrow.Table("test.feather") |> Tables.rowtable

Did you find an answer for this?

https://stackoverflow.com/questions/66388141/how-to-append-a-dataframe-to-an-existing-apache-arrow-file-on-disk

I think the answer is no

I ended up using the JSON Lines format instead. If the file ends up too large, I compress it with GZip.jl.
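For anyone who wants to try the same workaround, here is a minimal sketch of writing one JSON object per line and reading the lines back. It assumes the JSON.jl package (the reply above does not say which JSON package was used), and the file name `test.jsonl` is just a placeholder.

```julia
using JSON  # assumption: JSON.jl; any JSON package with a serialize-to-string call would do

dat = [(a=1, b=2), (a=3, b=4)]
path = "test.jsonl"  # hypothetical file name

# Start from an empty file so that rerunning this demo does not accumulate rows
rm(path; force=true)

for row in dat
    # Reopen in append mode for each row; the row is on disk once the file is closed
    open(path, "a") do io
        println(io, JSON.json(row))
    end
end

# Read the file back: each line parses to a Dict with the keys "a" and "b"
rows = [JSON.parse(line) for line in eachline(path)]
```

Because every line is a complete JSON document, a crash in the middle of the loop loses at most the row that was being written.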

I don’t see it in the docs, but

added some support for this.
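Recent Arrow.jl releases include an `Arrow.append` function. Assuming that is the support meant here, a minimal sketch of the original example could look like the following. The file name `test.arrows` is a placeholder, and the data has to be written in the Arrow IPC stream format (`file=false`), because a file written in the default Feather file format cannot be appended to.

```julia
using Arrow, Tables

dat = [(a=1, b=2), (a=3, b=4)]
path = "test.arrows"  # hypothetical file name

# Write the first row in IPC stream format (file=false) so the file can be appended to later
Arrow.write(path, [dat[1]]; file=false)

# Append each remaining row as its own record batch; the schema must match the existing data
for row in dat[2:end]
    Arrow.append(path, [row])
end

# Read everything back as a vector of NamedTuples
rows = Arrow.Table(path) |> Tables.rowtable
```

Each append adds a separate record batch, so for a long-running job it is probably worth buffering several API results and appending them together rather than writing truly one row at a time.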
