Parquet: writing data as row groups

I would appreciate help with my problem. I have data that come in series: after processing the raw input I have about 100 columns and about 10^8 rows per series (and about 100 series in total). So far I have been using the HDF5 format to store it, but I would like to test whether Parquet performs better, especially if I later select rows by filtering on specific column values.

I have in mind a parquet file with row groups, one for each series (the columns and the data format will always be the same, but each series might have been taken under different conditions, so the data interpretation may differ). I have tried the Parquet2.jl package, but I'm stuck on how to achieve row groups in a file (or directory). Here's a piece of code that generates some pseudo-data:

using StatsBase: sample  # `sample` is not in Base

function generate_data(N)
    Mmax = 100
    # column 1: multiplicity M; then an (E, t) pair per detector
    hits = zeros(UInt16, Mmax * 2 + 1, N)

    for i in 1:N
        M = rand(1:10)
        hits[1, i] = UInt16(M)
        dets = sample(1:Mmax, M, replace=false)
        for d in dets
            hits[2*(d-1)+2, i] = floor(UInt16, rand() * 30000.0)
            t = randn() * 1000.0 + 5000.0
            if t < 0
                t = 0.0
            end
            hits[2*(d-1)+3, i] = round(UInt16, t)
        end
    end

    # assemble a NamedTuple of column vectors (a Tables.jl column table)
    data = (M=hits[1, :], )
    names = ["E", "t"]
    for j in 1:Mmax
        for k in 1:2
            data = merge(data, (Symbol("$(names[k])_$j") => hits[2*(j-1)+k+1, :], ))
        end
    end
    
    data
end

And with this function I've tried to append the series to a file:

function write_parquet(filename; k=10, N=1_000_000)
    open(filename, write=true) do io
        fw = Parquet2.FileWriter(io)
        for i in 1:k
            data = generate_data(N)
            Parquet2.writetable!(fw, data)  # each call appends one row group
        end
        Parquet2.finalize!(fw)  # write the footer metadata
    end
end

The resulting file for k=1 is about 82 MB, and for k=10 about 820 MB, so it seems that the data are written. But upon opening

ds = Parquet2.Dataset(filename)

there is only one row group, with 10^6 rows.

In principle it should be possible to append data to unfinished files (e.g. the fastparquet write function has an "append" keyword that allows adding a new row group), but can this be achieved with Parquet2.jl (or Parquet.jl)?

My mistake, the function above works properly. Upon opening the resulting file, there are k row groups

ds = Parquet2.Dataset("test.prq")
Parquet2.nrowgroups(ds)

and ds can be iterated over, one row group at a time.
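For completeness, here is a minimal round-trip sketch of this pattern (the file name `tiny.parquet`, the toy columns, and the `M .> 2` cut are made up for illustration; it assumes each row group satisfies the Tables.jl interface, so it can be materialized as a DataFrame):

```julia
using Parquet2, DataFrames

# write a tiny file with two row groups: one writetable! call per group
open("tiny.parquet", write=true) do io
    fw = Parquet2.FileWriter(io)
    Parquet2.writetable!(fw, (M=UInt16[1, 2, 3], E_1=UInt16[10, 20, 30]))
    Parquet2.writetable!(fw, (M=UInt16[4, 5, 6], E_1=UInt16[40, 50, 60]))
    Parquet2.finalize!(fw)
end

ds = Parquet2.Dataset("tiny.parquet")
println(Parquet2.nrowgroups(ds))  # 2

# iterating the Dataset yields one row group (series) at a time,
# so each series can be interpreted and filtered independently
for (i, rg) in enumerate(ds)
    df = DataFrame(rg)
    sel = df[df.M .> 2, :]  # filter on a specific column within this group
    println("row group $i: $(nrow(sel)) rows with M > 2")
end
```

Reading per row group keeps only one series in memory at a time, which matters at 10^8 rows per series.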