I’m not sure if it’s relevant, but I really like the Columns type from IndexedTables, where you can easily extract a column but also iterate on rows, and both operations are efficient:
julia> using IndexedTables
julia> df = Columns(x = ["a", "b"], y = [1, 2])
2-element IndexedTables.Columns{NamedTuples._NT_x_y{String,Int64},NamedTuples._NT_x_y{Array{String,1},Array{Int64,1}}}:
(x = "a", y = 1)
(x = "b", y = 2)
julia> columns(df,:x)
2-element Array{String,1}:
"a"
"b"
julia> for i in df
           println(i)
       end
(x = "a", y = 1)
(x = "b", y = 2)
In particular, the row iterator (of named tuples) is useful because it makes it very easy to select rows, which at the moment is a bit clumsy on a DataFrame without extra packages. I think it’d be really cool to have a unified interface with similar features for column extraction and row iteration across DataFrames and Columns.
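To make the row-selection point concrete, here is a rough sketch of the pattern I have in mind. It deliberately does not use IndexedTables or DataFrames: it assumes a Julia version with built-in named tuples (0.7 or later) and uses a plain named tuple of vectors as a stand-in for a column table. The names `cols`, `rows`, and `selected` are just illustrative, and materializing `rows` as a vector is for clarity only, not a claim about how Columns iterates internally.

```julia
# Stand-in for a column table: a named tuple of equal-length vectors
# (an assumption for illustration, not the IndexedTables Columns type).
cols = (x = ["a", "b"], y = [1, 2])

# Column extraction is plain field access.
xs = cols.x

# Row iteration: build one named tuple per row by zipping the columns.
rows = [(x = x, y = y) for (x, y) in zip(cols.x, cols.y)]

# Selecting rows becomes an ordinary filter over named tuples.
selected = filter(r -> r.y > 1, rows)

for r in selected
    println(r)
end
```

With rows exposed as named tuples, the predicate can refer to fields by name (`r.y`), which is the convenience I would like to see shared between DataFrames and Columns.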