Suppose I have a DataFrame df that was generated from a dictionary whose keys may include whitespace:
using DataFrames
myDict = Dict()
myDict["aKey"] = 1:10
myDict["anotherKey"] = zeros(10)
myDict["a_nice_key"] = rand(Bool, 10)
myDict["a nasty key"] = fill("oof", 10)
df = DataFrame(myDict)
If I wanted to select the columns a_nice_key and anotherKey for the rows where aKey is less than, say, 4, this is easy to do with Query.jl:
using Query
x = @from i in df begin
@where i.aKey < 4
@select {i.a_nice_key, i.anotherKey}
@collect DataFrame
end
What I cannot figure out is how to select elements from the a nasty key column, since its symbol is not a simple identifier. For example, this does not work:
x = @from i in df begin
@where i.aKey < 4
@select i[!, Symbol("a nasty key")]
@collect DataFrame
end
Is there some way I can work around this? I have tried various permutations but cannot find any way to access these columns in my DataFrame.
I run into the same trouble with the DataFramesMeta package (after restarting the REPL to clear the name conflicts with Query.jl):
using DataFramesMeta, Lazy
x = @> begin
df
@where(:aKey .< 4)
@select(:a_nice_key, Symbol("a nasty key"))
end
(Interestingly enough, select works with one string turned into a Symbol, but no more than one.)
Finally, in neither the Query nor the DataFramesMeta package can I figure out how to splat a predefined list of column names into the select. For example, with DataFramesMeta:
colsOfInterest = [:anotherKey, :a_nice_key]
x = @> begin
df
@where(:aKey .< 4)
@select(colsOfInterest...)
end
fails. Is there a way to resolve both of these issues in either of these packages (or in another querying package)?
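To make the target concrete, here is a minimal sketch of the result I am after, written with plain DataFrames indexing instead of a query macro. It assumes a reasonably recent DataFrames.jl (one that supports the df.col and df[!, col] syntax) and reuses the names from above; what I would like is the equivalent of this inside a Query.jl or DataFramesMeta pipeline.

```julia
using DataFrames

# Same data as above; the Dict is keyed by strings, one of which has spaces.
myDict = Dict{String,Any}()
myDict["aKey"] = 1:10
myDict["anotherKey"] = zeros(10)
myDict["a_nice_key"] = rand(Bool, 10)
myDict["a nasty key"] = fill("oof", 10)
df = DataFrame(myDict)

# A predefined list of columns, including the one whose name contains spaces.
colsOfInterest = [:anotherKey, :a_nice_key, Symbol("a nasty key")]

# Rows are filtered with a boolean mask, columns with the vector of Symbols.
x = df[df.aKey .< 4, colsOfInterest]
```

This covers both the awkward column name and the predefined column list, but only outside the query macros.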