Force describe() not to omit columns

Is it possible to force describe not to omit columns when printing?


That currently isn’t possible, but negative indexing is definitely on our radar. There has been discussion of implementing something along the lines of Not(), Between() and regex matching from JuliaDB somewhere in the DataFrames ecosystem, which would make something like describe(select(df, Not(:a))) very easy to do.

A workaround for now would be

df[setdiff(names(df), vars_i_dont_want)] |> describe
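For readers on a more recent DataFrames release, here is a minimal sketch of the same workaround. It assumes DataFrames 1.x, where `names(df)` returns strings and a row selector is required when indexing; the sample data and the contents of `vars_i_dont_want` are made up for illustration.

```julia
using DataFrames

# Hypothetical sample data, not taken from the thread
df = DataFrame(a = 1:3, b = [2.0, 4.0, 6.0], c = ["x", "y", "z"])
vars_i_dont_want = ["a"]   # hypothetical list of columns to leave out

# names(df) returns a Vector{String} in DataFrames 1.x
kept = setdiff(names(df), vars_i_dont_want)

# df[:, kept] copies the selected columns; describe summarises only those
summary_df = describe(df[:, kept])

# With Not() available, the same selection can be written as:
summary_not = describe(select(df, Not(:a)))
```

Both forms only control which input columns are summarised. They do not change how many statistics columns are printed.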

Actually, I was unclear. I did not mean input columns but output columns. So I meant the suppression of columns in the printed output, not of rows. I figured out that on Linux, I can use stty cols 180 to make sure this does not happen.

the setdiff was nice, though…

regards, /iaw

Could you provide an MWE of what you want? Are you talking about describe with a DataFrame?

Yes, I was. It is unimportant for me now that I know how to fix it, but here is an example to explain what I mean:

julia> using DataFrames, Serialization; df= deserialize(open("sample-df.bin"));

julia> df
6×4 DataFrame
│ Row │ n1 │ n2   │ n3        │ n4  │
├─────┼────┼──────┼───────────┼─────┤
│ 1   │ 99 │ 9801 │ -0.999207 │ 'a' │
│ 2   │ 1  │ 1    │ 0.841471  │ 'b' │
│ 3   │ 3  │ 9    │ 0.14112   │ 'c' │
│ 4   │ 5  │ 25   │ -0.958924 │ 'd' │
│ 5   │ 7  │ 49   │ 0.656987  │ 'e' │
│ 6   │ 9  │ 81   │ 0.412118  │ 'f' │

julia> describe(df)
4×8 DataFrame
│ Row │ variable │ mean      │ min       │ median   │ max      │ nunique │ nmissing │ eltype  │
├─────┼──────────┼───────────┼───────────┼──────────┼──────────┼─────────┼──────────┼─────────┤
│ 1   │ n1       │ 20.6667   │ 1         │ 6.0      │ 99       │         │          │ Int64   │
│ 2   │ n2       │ 1661.0    │ 1         │ 37.0     │ 9801     │         │          │ Int64   │
│ 3   │ n3       │ 0.0155942 │ -0.999207 │ 0.276619 │ 0.841471 │         │          │ Float64 │
│ 4   │ n4       │           │ 'a'       │          │ 'f'      │ 6       │          │ Char    │

with stty cols 64:


julia> describe(df)
4×8 DataFrame. Omitted printing of 3 columns
│ Row │ variable │ mean      │ min       │ median   │ max      │
├─────┼──────────┼───────────┼───────────┼──────────┼──────────┤
│ 1   │ n1       │ 20.6667   │ 1         │ 6.0      │ 99       │
│ 2   │ n2       │ 1661.0    │ 1         │ 37.0     │ 9801     │
│ 3   │ n3       │ 0.0155942 │ -0.999207 │ 0.276619 │ 0.841471 │
│ 4   │ n4       │           │ 'a'       │          │ 'f'      │

Read the docstring of describe: there is a stats keyword argument where you can specify which results you want.

I think even when specifying all of them (stats = [:mean, :min, :median, :max, :nmissing, :nunique, :eltype]), it still excludes columns if the terminal is not wide enough.
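As a rough sketch of how choosing statistics affects the width of the result: the example below assumes DataFrames 1.x, where the statistics are passed as positional symbols rather than through a stats keyword. The sample data is rebuilt from the values shown in the printout above.

```julia
using DataFrames

# Rebuild the sample frame from the values printed earlier in the thread
df = DataFrame(n1 = [99, 1, 3, 5, 7, 9])
df.n2 = df.n1 .^ 2
df.n3 = sin.(df.n1)
df.n4 = collect('a':'f')

# Request only two statistics; the result has three columns
# (variable, mean, eltype), so it is more likely to fit a narrow terminal
narrow = describe(df, :mean, :eltype)
```

This reduces what is computed, but whether the remaining columns fit still depends on the terminal width.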

One of the benefits of the new describe function is that its result is just a DataFrame, so any method that prints all columns of any DataFrame will work.

This might be a helpful link. I think show(describe(df), true) should work.
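In more recent DataFrames releases the same idea is spelled with a keyword. A minimal sketch, assuming DataFrames 1.x, where `show` accepts `allcols = true`; the sample data is rebuilt from the values shown in the printout above.

```julia
using DataFrames

# Rebuild the sample frame from the values printed earlier in the thread
df = DataFrame(n1 = [99, 1, 3, 5, 7, 9])
df.n2 = df.n1 .^ 2
df.n3 = sin.(df.n1)
df.n4 = collect('a':'f')

summary_df = describe(df)

# Print every column of the summary, regardless of terminal width
show(summary_df, allcols = true)

# The same call works with an explicit IO, e.g. to capture the text
io = IOBuffer()
show(io, summary_df, allcols = true)
printed = String(take!(io))
```

Unlike `stty cols 180`, this does not depend on the terminal settings, at the cost of lines that may wrap.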

Though feel free to post an issue at DataFrames.
