DataFrame row to Dict

I’d like to query a row of a DataFrame and return the result as a Dict. I was able to do it, but the code isn’t pretty (I’m a Python programmer converting to Julia). Here is an MWE:

Input DataFrame:

DataFrame(name=["John", "Sally", "Roger"],
                             age=[54., 34., 79.],
                             children=[0, 2, 4])
3×3 DataFrame
│ Row │ name   │ age     │ children │
│     │ String │ Float64 │ Int64    │
├─────┼────────┼─────────┼──────────┤
│ 1   │ John   │ 54.0    │ 0        │
│ 2   │ Sally  │ 34.0    │ 2        │
│ 3   │ Roger  │ 79.0    │ 4        │

Desired output:

Dict{String,Real} with 2 entries:
  "age"      => 54.0
  "children" => 0

Here is my code:

using DataFrames, Query  # `using` (not `import`) brings DataFrame and @from into scope

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54., 34., 79.],
               children=[0, 2, 4])

john = @from i in df begin
    @where i.name == "John"
    @select {i.age, i.children}
    @collect DataFrame
end
# Take the first row of the returned DataFrame
john = john[1,:]

johnDict = Dict("age" => john.age, "children" => john.children)

Thanks in advance! I’m new to data-wrangling in Julia.

Dict(names(john) .=> values(john))

Why do you need a Dict, though? A DataFrameRow can do everything a Dict can.
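To make that concrete, here is a minimal sketch of using a DataFrameRow like a Dict, plus two ways to convert it when a real Dict is needed. It assumes a reasonably recent DataFrames.jl (1.x), where names returns strings and rows can be indexed by String; the variable names d_str and d_sym are just for illustration.

```julia
using DataFrames

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54., 34., 79.],
               children=[0, 2, 4])

# A DataFrameRow is a view into one row of the parent DataFrame.
john = df[1, [:age, :children]]

# Dict-like access on the row itself
john.age              # property access
john[:children]       # index by Symbol
john["age"]           # index by String (assumes a recent DataFrames.jl)
haskey(john, :age)    # key lookup

# Iterate over column name => value pairs (names come back as Symbols)
for (col, val) in pairs(john)
    println(col, " => ", val)
end

# Conversion when an actual Dict is required
d_str = Dict(names(john) .=> values(john))   # String keys
d_sym = Dict(pairs(john))                    # Symbol keys
```

Note that the row is a view, so it reflects later changes to df, while the Dicts are independent copies of the values.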


Well, true. Because I’m trying to learn the syntax I guess (and I hate being stumped).

Thanks for the help. Is there a one-liner for the DataFrame query as well? What I have seems excessive.

Thanks again, Sam

I’m not too familiar with Query, but that seems like the correct code.

You can call first on a DataFrame to get the first row.
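As a rough sketch of how first can shorten the query without any extra packages (this is one possible approach, not the only one; john_alt and idx are illustrative names):

```julia
using DataFrames

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54., 34., 79.],
               children=[0, 2, 4])

# first(df) returns the first row as a DataFrameRow
first(df)

# One-liner: Boolean mask for the rows, a column list, then the first match.
# This errors if no row matches, because the filtered DataFrame is empty.
john = first(df[df.name .== "John", [:age, :children]])

# Alternative: locate the first matching row index, then index directly.
# findfirst returns nothing when there is no match, so check before indexing.
idx = findfirst(==("John"), df.name)
john_alt = df[idx, [:age, :children]]
```

The first form copies the matching rows into a new DataFrame before taking the row; the second indexes into df directly and returns a view of the original row.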

I don’t use Query much either, but that does seem correct. You could also use Underscores.jl to pipe things from start to finish (two underscores, __, refer to the whole table being piped). It isn’t really any shorter, though.

using DataFrames, Underscores

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54., 34., 79.],
               children=[0, 2, 4])

john = @_ df |> filter(:name => isequal("John"), __) |> select(__, :age, :children) |> first

To add on, this is the solution using just DataFrames and Pipe:

julia> df = DataFrame(name=["John", "Sally", "Roger"],
                             age=[54., 34., 79.],
                             children=[0, 2, 4])
3×3 DataFrame
│ Row │ name   │ age     │ children │
│     │ String │ Float64 │ Int64    │
├─────┼────────┼─────────┼──────────┤
│ 1   │ John   │ 54.0    │ 0        │
│ 2   │ Sally  │ 34.0    │ 2        │
│ 3   │ Roger  │ 79.0    │ 4        │

julia> using Pipe

julia> @pipe df |>
       filter(r -> r.name == "John", _) |>
       select(_, :age, :children) |>
       first(_)
DataFrameRow
│ Row │ age     │ children │
│     │ Float64 │ Int64    │
├─────┼─────────┼──────────┤
│ 1   │ 54.0    │ 0        │
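
For completeness, here is a sketch that goes all the way from the DataFrame to the Dict asked for in the original question, using only DataFrames functions already mentioned in this thread. The intermediate names (matches, row) are illustrative, and it assumes a recent DataFrames.jl where names returns strings.

```julia
using DataFrames

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54., 34., 79.],
               children=[0, 2, 4])

# Keep the rows where name is "John", then keep only the two columns of interest
matches = select(filter(:name => ==("John"), df), :age, :children)

# Take the first matching row (a DataFrameRow)
row = first(matches)

# Convert the row to a Dict keyed by column name
johnDict = Dict(names(row) .=> values(row))
```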