Convert struct fields and their values to DataFrames

Hi, I want to convert the following struct to a DataFrame:

struct MyStruct
    x::Vector{Float64}
    y::Vector{Float64}
end

I have written the following function to convert the fields of MyStruct and their values to a DataFrame:

function struct_to_dataframe(s::MyStruct)
    # Create an empty data frame
    df = DataFrame()
    fields = fieldnames(s)
    # Get the values for each field
    values = [getfield(s, field) for field in fields]
    push!(df, (field=fields, value=values))
    return df
end

# Create an instance
s = MyStruct(rand(10), rand(10))
# Convert to a DataFrame
df = struct_to_dataframe(s)

But the outcome is something like this:

1×2 DataFrame
 Row │ field     value
     │ Tuple…    Array…
─────┼─────────────────────────────────────────────
   1 │ (:x, :y)  [[0.976849, 0.653888, 0.956756, …

Instead, I want each field to be its own column holding that field's values. How can I do that? Thanks!

Maybe the mistake is in the call to fieldnames. It requires the struct type itself, not an instance. Use the typeof function to get the type of the instance.

julia> struct MyStruct
           x::Vector{Float64}
           y::Vector{Float64}
       end

julia> function struct_to_dataframe(s)
           fields = fieldnames(typeof(s))
           still_vector = [getfield(s,field) for field in fields]
           return DataFrame(hcat(still_vector...), collect(fields))
       end
struct_to_dataframe (generic function with 1 method)

julia> s = MyStruct(rand(10),rand(10));

julia> struct_to_dataframe(s)
10×2 DataFrame
 Row │ x         y
     │ Float64   Float64
─────┼──────────────────────
   1 │ 0.286858  0.140575
   2 │ 0.253701  0.821031
   3 │ 0.767813  0.474758
   4 │ 0.669886  0.74336
   5 │ 0.463448  0.956407
   6 │ 0.692433  0.715941
   7 │ 0.520875  0.447658
   8 │ 0.783861  0.00811924
   9 │ 0.928598  0.885618
  10 │ 0.681593  0.271318
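
As a variation on the function above: hcat builds a single matrix, so every column ends up with one common element type. A minimal sketch that passes the vector of columns directly to the DataFrame constructor keeps each field's own element type. The names Mixed and struct_to_dataframe_cols are made up for illustration, and the sketch assumes every field is a vector of the same length.

```julia
using DataFrames

# Hypothetical struct whose fields have different element types
struct Mixed
    x::Vector{Float64}
    n::Vector{Int64}
end

function struct_to_dataframe_cols(s)
    fields = fieldnames(typeof(s))           # tuple of Symbols, taken from the type
    cols = [getfield(s, f) for f in fields]  # one vector per field, no hcat
    return DataFrame(cols, collect(fields))
end

df = struct_to_dataframe_cols(Mixed(rand(3), [1, 2, 3]))
eltype(df.x), eltype(df.n)   # (Float64, Int64)
```

With hcat, the integer column here would have been promoted to Float64.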

@rmsmsgood thank you so much! Related to my question: if I make one of the fields of MyStruct an integer, I get a dimension mismatch error. How can I handle that?

Do you mean making one of the fields of your struct an integer, not an integer vector? Like this?

struct MyStruct
    x::Vector{Float64}
    y::Vector{Float64}
    k::Int64
end

I guess that gives a dimension error because x, y, and k have different lengths and hcat tries to combine them into one matrix. If x and y have length 4, the matrix would have the form below:

x[1] y[1] k[1]
x[2] y[2] 😡
x[3] y[3] 😡
x[4] y[4] 😡

so hcat probably raises the dimension mismatch error because the 😡 entries don't exist. You can take care of this in several ways.

  1. Turn the integer into an integer vector. The repeat function may fit here, e.g. k ↦ [k, k, …, k].
  2. Just ignore it. In the loop over the fields, check the type or length of each field and skip it if it is not an array.
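
The two options above could be sketched roughly like this. The names MyStructK, expand_to_dataframe, and vectors_to_dataframe are made up for illustration, fill is used in place of repeat to build the constant vector, and the sketch assumes at least one field is a vector and that all vector fields have the same length.

```julia
using DataFrames

# Hypothetical name, to avoid redefining MyStruct
struct MyStructK
    x::Vector{Float64}
    y::Vector{Float64}
    k::Int64
end

# Option 1: expand scalar fields into vectors of the common length
function expand_to_dataframe(s)
    fields = fieldnames(typeof(s))
    vals = [getfield(s, f) for f in fields]
    # Assumption: at least one field is a vector and all vectors share one length
    n = maximum(length(v) for v in vals if v isa AbstractVector)
    cols = [v isa AbstractVector ? v : fill(v, n) for v in vals]
    return DataFrame(cols, collect(fields))
end

# Option 2: keep only the fields that are vectors
function vectors_to_dataframe(s)
    fields = [f for f in fieldnames(typeof(s)) if getfield(s, f) isa AbstractVector]
    return DataFrame([getfield(s, f) for f in fields], fields)
end

s = MyStructK(rand(4), rand(4), 3)
df_all = expand_to_dataframe(s)    # 4 rows, columns x, y, k
df_vec = vectors_to_dataframe(s)   # 4 rows, columns x, y
```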

If you mean something like this, the DataFrame constructor can handle the scalar field for you by automatically broadcasting it to the length of the other columns:


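using DataFrames

struct MS
    x::Vector{Float64}
    y::Vector{Float64}
    k::Int64
end
s = MS(rand(10), rand(10), rand(1:10))
# Collect the value of every field, by index
vals = getfield.(Ref(s), 1:fieldcount(typeof(s)))
# Pass each field as a keyword argument; the scalar k is recycled to the column length
DataFrame(; zip(propertynames(s), vals)...)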
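
The scalar recycling used above can also be seen in isolation. A minimal sketch, with made-up values, assuming at least one of the columns is a vector so that the length is known:

```julia
using DataFrames

# The scalar 7 is repeated to match the length of the vector column
df = DataFrame(x = [1.0, 2.0, 3.0], k = 7)
df.k   # [7, 7, 7]
```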