Is it possible to force describe not to omit columns when printing?
That currently isn't possible, but negative indexing is definitely on our radar. There has been discussion of implementing something along the lines of Not(), Between() and regex matching from JuliaDB somewhere in the DataFrames ecosystem, which would make something like describe(select(df, Not(:a))) very easy to do.
A workaround for now would be
df[setdiff(names(df), vars_i_dont_want)] |> describe
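For readers on a recent DataFrames release, here is a minimal sketch of both forms. It assumes that names(df) returns strings and that indexing needs an explicit row selector (df[:, cols]), which differs from the df[cols] form in the workaround above. The table and the contents of vars_i_dont_want are made up for illustration.

```julia
using DataFrames

# Hypothetical stand-in data, not from the thread
df = DataFrame(a = 1:3, b = [2.0, 4.0, 6.0], c = ["x", "y", "z"])
vars_i_dont_want = ["a"]

# The setdiff workaround, written with an explicit row selector
desc_setdiff = describe(df[:, setdiff(names(df), vars_i_dont_want)])

# The same selection using the Not() selector
desc_not = describe(select(df, Not(:a)))
```

Both calls should summarise only columns b and c.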
Actually, I was unclear. I meant the columns of the output, not of the input: I want to avoid suppression of output columns, not of rows. I figured out that on Linux I can use stty cols 180 to make sure this does not happen.
The setdiff was nice, though…
regards, /iaw
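The stty trick changes the width that the terminal reports. A similar effect can probably be had from inside Julia by wrapping stdout in an IOContext that carries its own :displaysize. This is only a sketch: it assumes a recent DataFrames release whose show method honours the :limit and :displaysize properties, and the small table is made up for illustration.

```julia
using DataFrames

# Small illustrative table (hypothetical data)
n1 = [99, 1, 3, 5, 7, 9]
df = DataFrame(n1 = n1, n2 = n1 .^ 2, n3 = sin.(n1), n4 = collect('a':'f'))

# :limit => true requests the cropped, REPL-style display;
# :displaysize => (rows, cols) overrides the size the terminal reports.
wide = IOContext(stdout, :limit => true, :displaysize => (24, 180))
show(wide, describe(df))
```

Unlike stty, this only affects the one show call and leaves the terminal settings alone.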
Could you provide an MWE of what you want? Are you talking about describe with a DataFrame?
Yes, I was. It is unimportant for me now that I know how to fix it, but here is an example to explain what I mean:
julia> using DataFrames, Serialization; df= deserialize(open("sample-df.bin"));
julia> df
6×4 DataFrame
│ Row │ n1 │ n2   │ n3        │ n4  │
├─────┼────┼──────┼───────────┼─────┤
│ 1   │ 99 │ 9801 │ -0.999207 │ 'a' │
│ 2   │ 1  │ 1    │ 0.841471  │ 'b' │
│ 3   │ 3  │ 9    │ 0.14112   │ 'c' │
│ 4   │ 5  │ 25   │ -0.958924 │ 'd' │
│ 5   │ 7  │ 49   │ 0.656987  │ 'e' │
│ 6   │ 9  │ 81   │ 0.412118  │ 'f' │
julia> describe(df)
4×8 DataFrame
│ Row │ variable │ mean      │ min       │ median   │ max      │ nunique │ nmissing │ eltype  │
├─────┼──────────┼───────────┼───────────┼──────────┼──────────┼─────────┼──────────┼─────────┤
│ 1   │ n1       │ 20.6667   │ 1         │ 6.0      │ 99       │         │          │ Int64   │
│ 2   │ n2       │ 1661.0    │ 1         │ 37.0     │ 9801     │         │          │ Int64   │
│ 3   │ n3       │ 0.0155942 │ -0.999207 │ 0.276619 │ 0.841471 │         │          │ Float64 │
│ 4   │ n4       │           │ 'a'       │          │ 'f'      │ 6       │          │ Char    │
with stty cols 64:
julia> describe(df)
4×8 DataFrame. Omitted printing of 3 columns
│ Row │ variable │ mean      │ min       │ median   │ max      │
├─────┼──────────┼───────────┼───────────┼──────────┼──────────┤
│ 1   │ n1       │ 20.6667   │ 1         │ 6.0      │ 99       │
│ 2   │ n2       │ 1661.0    │ 1         │ 37.0     │ 9801     │
│ 3   │ n3       │ 0.0155942 │ -0.999207 │ 0.276619 │ 0.841471 │
│ 4   │ n4       │           │ 'a'       │          │ 'f'      │
Read the docstring of describe: there is a stats keyword argument where you can specify which results you want.
I think that even when specifying all of them (stats = [:mean, :min, :median, :max, :nmissing, :nunique, :eltype]), it still omits columns if the terminal is not wide enough.
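To make the distinction concrete, here is a hedged sketch of selecting statistics. It assumes a recent DataFrames release, where the statistics are passed as positional symbols rather than through the stats keyword quoted above. The table is a reconstruction consistent with the printed sample (n2 = n1^2, n3 = sin(n1)), not the original serialized file.

```julia
using DataFrames

# Reconstruction of the sample table (assumption: not the original file)
n1 = [99, 1, 3, 5, 7, 9]
df = DataFrame(n1 = n1, n2 = n1 .^ 2, n3 = sin.(n1), n4 = collect('a':'f'))

# Request specific statistics as positional symbols
d = describe(df, :mean, :min, :median, :max, :nunique, :nmissing, :eltype)

# The result is an ordinary DataFrame: every requested column is present,
# even if a narrow terminal does not print all of them.
println(size(d))
println(names(d))
```

Choosing statistics controls what is computed; whether all of the resulting columns are printed is a separate display question.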
One of the benefits of the new describe function is that it just returns a DataFrame, so any method that prints all columns of any DataFrame will work.
This might be a helpful link. I think show(describe(df), true) should work.
Though feel free to post an issue at DataFrames.
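As a sketch of the show suggestion above: in recent DataFrames releases the positional true appears to have been replaced by an allcols keyword, so the call below assumes that newer form. The table is again a reconstruction consistent with the printed sample, not the original file.

```julia
using DataFrames

# Reconstruction of the sample table (assumption: not the original file)
n1 = [99, 1, 3, 5, 7, 9]
df = DataFrame(n1 = n1, n2 = n1 .^ 2, n3 = sin.(n1), n4 = collect('a':'f'))

d = describe(df)

# Print every column regardless of the terminal width
show(d, allcols = true)
```

Wide tables will wrap in a narrow terminal, but no column is dropped from the printout.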