DataFrames: why is `df[2,]` the same as `df[2]`?

I’m not sure if it’s relevant, but I really like the Columns type from IndexedTables, where you can easily extract a column and also iterate over rows, and both operations are efficient:

julia> using IndexedTables

julia> df = Columns(x = ["a", "b"], y = [1, 2])
2-element IndexedTables.Columns{NamedTuples._NT_x_y{String,Int64},NamedTuples._NT_x_y{Array{String,1},Array{Int64,1}}}:
 (x = "a", y = 1)
 (x = "b", y = 2)

julia> columns(df,:x)
2-element Array{String,1}:
 "a"
 "b"

julia> for i in df
           println(i)
       end
(x = "a", y = 1)
(x = "b", y = 2)

In particular, the row iterator (of named tuples) is useful because it makes it very easy to select rows, which at the moment is a bit clumsy on a DataFrame without extra packages. I think it’d be really cool to have a unified interface across DataFrames and Columns, with similar features for column extraction and row iteration.
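To make the row-selection point concrete, here is a rough sketch in plain Julia (1.x syntax, no packages). It is not the IndexedTables or DataFrames API: the names `cols`, `rows`, and `selected` are made up for illustration, and a named tuple of vectors just stands in for a column-oriented table.

```julia
# Hypothetical stand-in for a column store: a named tuple of vectors.
cols = (x = ["a", "b"], y = [1, 2])

# Column extraction is plain field access; it returns the stored vector itself.
xcol = cols.x

# Row iteration: lazily zip the columns into named tuples, one per row.
rows = ((x = x, y = y) for (x, y) in zip(cols.x, cols.y))

# Selecting rows is then an ordinary predicate on a named tuple.
selected = [r for r in rows if r.y > 1]

println(selected)
```

Here `selected` ends up holding only the row with `y == 2`. The nice part is that the filter reads like normal Julia code on a row, while the data stays stored by column.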
