How to combine multiple SummaryStats objects into a single DataFrame?

I am apparently still missing some basic data manipulation skills. Is there a quick way to create a DataFrame where the first row contains sx and the second row contains sy, with the appropriate column names?

julia> using StatsBase, DataFrames

julia> sx = summarystats(rand(10))
Summary Stats:
Length:         10
Missing Count:  0
Mean:           0.534381
Std. Deviation: 0.327727
Minimum:        0.046431
1st Quartile:   0.375997
Median:         0.521883
3rd Quartile:   0.772281
Maximum:        0.980553


julia> sy = summarystats(rand(10))
Summary Stats:
Length:         10
Missing Count:  0
Mean:           0.606235
Std. Deviation: 0.196903
Minimum:        0.377681
1st Quartile:   0.477612
Median:         0.566537
3rd Quartile:   0.719057
Maximum:        0.960698

I can write my own function to do this by extracting each element, but it seems like there should be a one-liner.

No, there is nothing built-in for this. You could do DataFrame([sy])?


Thanks. This is much better than what I was thinking:

julia> df = DataFrame([sx])
1×9 DataFrame
 Row │ mean      sd        min        q25       median    q75      max       nobs   nmiss
     │ Float64   Float64   Float64    Float64   Float64   Float64  Float64   Int64  Int64
─────┼────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.480167  0.267154  0.0865253  0.261997  0.532114  0.66425  0.801299     10      0

julia> df = vcat(df, DataFrame([sy]))
2×9 DataFrame
 Row │ mean      sd        min        q25       median    q75       max       nobs   nmiss
     │ Float64   Float64   Float64    Float64   Float64   Float64   Float64   Int64  Int64
─────┼─────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.480167  0.267154  0.0865253  0.261997  0.532114  0.66425   0.801299     10      0
   2 │ 0.382722  0.323311  0.0674511  0.156531  0.19415   0.650376  0.963897     10      0

I’ll mark it as answered, but it still seems like there could be a one-liner like DataFrame([sx], [sy], ...)

Yeah it’s easy

DataFrame([sx, sy])

Thought I tried that one… Thanks!
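
For anyone landing here later, a minimal sketch of the one-liner with both objects in a single vector. It assumes a StatsBase/DataFrames combination where DataFrame([sx]) works as shown earlier in the thread; the :variable label column and its values are only an illustration, not something the constructor adds for you.

```julia
using StatsBase, DataFrames

sx = summarystats(rand(10))
sy = summarystats(rand(10))

# Each SummaryStats object becomes one row; its fields become the columns.
df = DataFrame([sx, sy])

# Optional: add a label column in front so the rows can be told apart.
# The name :variable and the labels "x" and "y" are arbitrary choices.
insertcols!(df, 1, :variable => ["x", "y"])
```

This gives the same result as building one DataFrame per object and combining them with vcat, just without the intermediate tables.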