I have a number of network packets containing sampled data that is fragmented, and I was wondering whether the defragmentation can be done with DataFrames or whether I have to pre-process the data first.
sort!(df, [:frame, :subframe, :seq, :sampleno])
467×6 DataFrame
 Row │ frame  subframe  seq     sampleno  nsamples  samples
     │ UInt8  UInt16    UInt16  UInt32    UInt32    Array{Int8,
─────┼─────────────────────────────────────────────────────────────
   1 │     4         7      13         0       330  [-1, 0, 144…
   2 │     4         7      13       330        16  [0, 0, 0, 0…
   3 │     4         8       0         0       112  [-1, 0, 144…
   4 │     4         8       0       112       234  [16, -118, …
   5 │     4         8       1         0       346  [-1, 0, 144…
The data for each (frame, subframe, seq) combination is fragmented across several rows (packets).
So I want every group of rows with the same (frame, subframe, seq) to be collapsed into one row, with the data in their samples columns concatenated.
I have read the docs and watched some tutorials, but could not find much on vector-valued column data. So I tried this:
gd = groupby(df, [:frame, :subframe, :seq] )
dfm = combine(gd, :samples => vcat, :nsamples => sum)
But I get the same number of rows in the resulting DataFrame, and the samples in samples_vcat are not concatenated. The only thing that worked as I expected is the nsamples_sum column, which seems to contain the sum of all nsamples in each group in gd.
Any help is much appreciated.
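For reference, here is a minimal sketch of one way the collapse described above might be done. It is not necessarily the only or the best approach. A likely explanation for the behaviour: vcat is called once per group with the whole samples column of that group, so it simply returns that vector of vectors, and combine spreads a returned vector over one row per element. The sketch instead joins the vectors with reduce(vcat, x) and wraps the result in Ref so that it is kept as a single cell. The toy data and its values are made up for illustration; only the column names and types are taken from the output above.

```julia
using DataFrames

# Made-up toy data mirroring the column layout above (values are illustrative only).
df = DataFrame(
    frame    = UInt8[4, 4, 4, 4, 4],
    subframe = UInt16[7, 7, 8, 8, 8],
    seq      = UInt16[13, 13, 0, 0, 1],
    sampleno = UInt32[0, 3, 0, 2, 0],
    nsamples = UInt32[3, 1, 2, 2, 4],
    samples  = [Int8[1, 2, 3], Int8[4], Int8[5, 6], Int8[7, 8], Int8[9, 10, 11, 12]],
)

# Sort first so the fragments of each group are concatenated in sampleno order.
sort!(df, [:frame, :subframe, :seq, :sampleno])
gd = groupby(df, [:frame, :subframe, :seq])

# reduce(vcat, x) joins the per-packet vectors of one group into a single vector;
# Ref(...) marks that vector as one cell so combine does not spread it over rows.
dfm = combine(gd,
    :samples  => (x -> Ref(reduce(vcat, x))) => :samples,
    :nsamples => sum => :nsamples,
)
```

With this toy input, dfm should have one row per (frame, subframe, seq) group, and the length of each samples vector should match the summed nsamples for that group.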