Let’s say you have a struct Foo that has 30 or so named fields.
Now you want to store it (without the use of JLD).
What is the go-to method for transforming a vector of these Foo objects into a table?
But it seems like this extremely simple use case is never spelled out in layman's terms.
Could you expand? Do you want each Foo object to be a column in a DataFrame, with each row a field, or do you want a vector of Foo objects?
I would like to have:
- each Foo object in the array as a row in the database
- with the columns as the fields (that are assumed to have a fixed-byte representation)
edit: here’s a quick rundown of how it would look:
Does this work?
t = fieldnames(typeof(x))
[getproperty(x, field) for field in t]
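df = DataFrame(a = Int[], b = Int[])  # allocate the column names and types; ... one entry per field
for x in VectorOfFoos
    push!(df, [getproperty(x, field) for field in fieldnames(typeof(x))])  # append one row per Foo
end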
You could also do
df = DataFrame(a = [x.a for x in VectorOfFoos], b = [x.b for x in VectorOfFoos]...)
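With 30 or so fields, writing out every column by hand gets tedious. Here is a minimal sketch of the same idea driven by `fieldnames`, assuming a two-field stand-in for `Foo` (the field names `a` and `b` and the variable `foos` are illustrative, not from the original post):

```julia
using DataFrames

# Two-field stand-in for the 30-field struct (illustrative definition).
struct Foo
    a::Int
    b::Float64
end

foos = [Foo(1, 2.0), Foo(2, 4.0)]

# Build one `name => column` pair per field, then splat the pairs
# into the DataFrame constructor.
cols = [name => [getfield(x, name) for x in foos] for name in fieldnames(Foo)]
df = DataFrame(cols...)
```

Each row of `df` then corresponds to one `Foo`, and each column to one field, without naming any field explicitly.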
Using StructArrays (not yet released) and the latest DataFrames, something like this is possible:
julia> using DataFrames, StructArrays
julia> struct Foo; a::Int; b::Float64; end
julia> c = [Foo(1, 2.0), Foo(2, 4.0)]
│ Row │ a │ b │
│ │ Int64 │ Float64 │
│ 1 │ 1 │ 2.0 │
│ 2 │ 2 │ 4.0 │
Edit: Updated the answer since master DataFrames makes this easier.
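The call that produced the table is not visible in the session above. As a sketch, assuming a StructArrays version that implements the Tables.jl interface and a DataFrames version that accepts any Tables.jl source, the conversion could look like this:

```julia
using DataFrames, StructArrays

struct Foo
    a::Int
    b::Float64
end

c = [Foo(1, 2.0), Foo(2, 4.0)]

# StructArray stores each field of Foo as its own column vector.
s = StructArray(c)

# Assumption: DataFrame can consume the StructArray as a table source.
df = DataFrame(s)
```

Because the `StructArray` already holds the data column by column, this avoids writing out the field names by hand.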
For shorter structs you might also be able to do something like:
reduce(vcat, map(x-> [x.a x.b], arrFoo))
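To make the shape of the result concrete, here is a small sketch with an illustrative two-field struct `Point` (not from the original post). Note that `[x.a x.b]` promotes both fields to a common element type, so this approach only suits structs whose fields are all numeric or otherwise compatible:

```julia
struct Point
    a::Int
    b::Float64
end

arr = [Point(1, 2.0), Point(2, 4.0), Point(3, 6.0)]

# Each element becomes a 1×2 row matrix; reduce(vcat, ...) stacks the rows.
m = reduce(vcat, map(x -> [x.a x.b], arr))
```

The result is a plain matrix with one row per struct, so the field names are lost and would have to be supplied separately when building a table.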
Add: There might be a performance regression using mapreduce:
struct n; a; b; end;
p = fill(n(rand(1:10), rand()*10), 10^6);
@time mapreduce(x->[x.a x.b], vcat, p);
@time reduce(vcat, map(x-> [x.a x.b], p));