Adding a new column to a DataFrame with Query.jl

query
dataframes

#1

Is there a way to use Query to just add a new column? In the docs there’s this example:

using Query, DataFrames

df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3,5,2])

x = @from i in df begin
    @select {i.name, Age=i.age}
    @collect DataFrame
end

but I’d like to do something like:

x = @from i in df begin
    @select {i, age_in_days=i.age*365}
    @collect DataFrame
end

to be equivalent to

x = @from i in df begin
    @select {i.name, i.age, i.children, age_in_days=i.age*365}
    @collect DataFrame
end

But what I get is:

3×2 DataFrames.DataFrame
│ Row │ i                                          │ age_in_days │
├─────┼────────────────────────────────────────────┼─────────────┤
│ 1   │ (name = "John", age = 23.0, children = 3)  │ 8395.0      │
│ 2   │ (name = "Sally", age = 42.0, children = 5) │ 15330.0     │
│ 3   │ (name = "Kirk", age = 59.0, children = 2)  │ 21535.0     │

I’ve got a giant table, so @selecting every column individually would be tedious. I know I can also just make a new vector and add it, but I’m doing a bunch of other filtering operations at the same time, and it would be nice to do them all at once.


#2

Not right now, unfortunately. I have a complete design for this worked out, but it will only work with the new named tuples in Julia 0.7, so we’ll have to wait until that is ready. The syntax will be {i..., age_in_days=i.age*365} once I get around to implementing this.


#3

@davidanthoff Awesome - that’s exactly what I’d expect for syntax (I actually tried that before writing here). I understand that it’s not implemented yet - I’ll just do it in two steps for now.
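
For anyone landing here, a minimal sketch of the two-step version: run the filtering in Query while selecting whole rows, then add the derived column to the collected DataFrame. The `i.age > 30` condition is only a placeholder for the real filters, and the column assignment assumes a DataFrames version that supports `df.col` property syntax (older versions would use `x[:age_in_days] = x[:age] .* 365` instead).

```julia
using Query, DataFrames

df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3, 5, 2])

# Step 1: filter with Query, keeping every column by selecting the whole row.
# The condition below is a placeholder (assumption), not part of the original question.
x = @from i in df begin
    @where i.age > 30
    @select i
    @collect DataFrame
end

# Step 2: add the derived column to the already-filtered DataFrame.
x.age_in_days = x.age .* 365
```

Selecting `i` on its own collects into the original columns rather than a single nested column, so the result has name, age, children, and age_in_days without listing each column by hand.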

Thanks!