I have a Dict{Symbol,Vector} and want to create a DataFrame out of it. It's easy enough to do DataFrame(dict), but I also want the columns to appear in the right order. For example,
Since I have the proper order in x[:symbols], I could rearrange it as below. It's a bit ugly, so I'm wondering if there's a better way... Perhaps DataFrames.jl should have another constructor.
I think that the Pair{Symbol,Vector}... constructor should do what you want, i.e.
using DataFrames
cols = Dict(:foo => [1,2,3], :bar => [4,5,6], :mouse => [7,8,9])
order = [:foo, :bar, :mouse]
DataFrame([name => cols[name] for name in order]...)
Otherwise, the DataFrame(columns, cnames) constructor you are using is fine.
Thanks @Tamas_Papp. I hadn't thought about that one.
I still hope that DataFrame would have another constructor that just takes a Dict and an array of keys in the specified order. Perhaps I'll submit a PR there.
When conversions are relatively easy, I don't think it is a good idea to provide constructors for cases like this (there are many similar ones).
All the constructor would do is apply the same one-liners as above, at the cost of increasing code complexity and maintenance burden for the package.
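To make the trade-off concrete, here is a minimal sketch of that one-liner wrapped in a user-side helper instead of a package constructor. The name `ordered_dataframe` is hypothetical and is not part of DataFrames.jl; the sketch assumes a recent Julia 1.x and a DataFrames.jl version that accepts `Pair{Symbol}` arguments, as in the snippet earlier in this thread.

```julia
using DataFrames

# Hypothetical helper (NOT part of DataFrames.jl): build a DataFrame from a
# dictionary of columns, placing the columns in the order given by `order`.
function ordered_dataframe(cols::AbstractDict{Symbol}, order::AbstractVector{Symbol})
    # Look up each column by name and splat the name => column pairs
    # into the pair-based DataFrame constructor.
    return DataFrame([name => cols[name] for name in order]...)
end

cols = Dict(:foo => [1, 2, 3], :bar => [4, 5, 6], :mouse => [7, 8, 9])
df = ordered_dataframe(cols, [:mouse, :foo, :bar])
```

A key missing from `cols` raises a `KeyError` at the lookup, which is arguably the behaviour you want for a mistyped column name.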
Looks like what is really needed here is an ordered dictionary type which would preserve the order in which you want the keys to appear. Adding DataFrame constructors isn't really the appropriate solution.
@Tamas_Papp and @bkamins, I'm not picking on you guys, but I humbly disagree. As a package writer, I believe the interface needs to be as user friendly as possible. Am I the only one who has this need? If everyone like me is writing the same one-liner, then it would make more sense to put that one line in the package.
I like @nalimilan's suggestion, however. If we have a keys() function that takes an OrderedDict object and returns the keys in their natural order, then we can bypass the auto-sorting feature in DataFrames. Hence, we don't need to burden the DataFrames package with the additional constructor.
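A rough sketch of the ordered-dictionary route, assuming the OrderedCollections.jl package is installed (DataStructures.jl also exports `OrderedDict`). As far as I can tell, DataFrames.jl only sorts the keys when it is handed a plain `Dict`, so the insertion order of an `OrderedDict` should carry over to the columns. Treat that as an assumption to check against the DataFrames.jl version you are running.

```julia
using DataFrames
using OrderedCollections  # assumption: installed; provides OrderedDict

# Keys are stored in insertion order, unlike a plain Dict.
od = OrderedDict(:mouse => [7, 8, 9], :foo => [1, 2, 3], :bar => [4, 5, 6])

# keys(od) iterates in insertion order, so no separate `order` vector is needed.
df = DataFrame(od)
```

If the version in use does not preserve the order, `DataFrame([k => v for (k, v) in od]...)` falls back to the pair-based constructor and relies only on the iteration order of `od`.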
For reference, DataFrames.jl's code looks like this:
function Base.convert(::Type{DataFrame}, d::Associative)
    colnames = keys(d)
    if isa(d, Dict)
        colnames = sort!(collect(keys(d)))
    else
        colnames = keys(d)
    end
    colindex = Index(Symbol[k for k in colnames])
    columns = Any[d[c] for c in colnames]
    DataFrame(columns, colindex)
end
I agree that for unambiguous conversions such as
DataFrame(dict::OrderedDict)
it would be nice to provide constructors (although it does raise a dependency issue).
However, as a DataFrames user I mostly agree with @Tamas_Papp and @bkamins. I don't see a need to try to guess every possible use of a constructor when a simple and efficient bit of code like a list comprehension would suffice. After all, one of the primary reasons to use a language like Julia as opposed to a language like C++ is that there's a lot of code which is very simple to write, so we don't require anywhere near as many specialized functions.