String Index for DataFrames

I am using the DataFrames package, and I would like to set a string column of it as the index. For example, let x =

│ Row │ name   │ val    │
│     │ String │ String │
├─────┼────────┼────────┤
│ 1   │ A      │ 1      │
│ 2   │ B      │ 2      │
│ 3   │ C      │ 4      │
│ 4   │ D      │ 3      │
│ 5   │ E      │ 4      │
│ 6   │ X      │ 5      │

Then, I want something like x["A", :val] to return 1. Currently, I have to do the following:

@where(x, :name .== "A")[1, :val]

which is bulky, and it is not obvious from the code that the @where should return only one row. Is there anything like pandas' set_index, or an alternative? I'm having trouble understanding setindex! in the DataFrames documentation, but it states that indices must be integers in any case.

Unfortunately, DataFrames.jl currently does not support indexing rows by the values of a key column. setindex! is a function defined in Base, and it is used for setting values in a DataFrame, not for creating an index. What you can do is:

filter(row->row.name == "A", df).val

or

df.val[df.name .== "A"]

or

df[df.name .== "A", :val]

or some variant of these.
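Note that each of these variants returns a vector of matches rather than a single value. A minimal sketch of getting the scalar out, assuming the table from the question is stored in df and that your Julia version provides Base's only (1.4 or later); the variable names matches and value are just for illustration:

```julia
using DataFrames

# Rebuild the example table from the question (both columns are Strings).
df = DataFrame(name = ["A", "B", "C", "D", "E", "X"],
               val  = ["1", "2", "4", "3", "4", "5"])

# Boolean-mask lookup: returns a Vector with one entry per matching row.
matches = df[df.name .== "A", :val]

# `only` returns the single element and throws an error if there are
# zero matches or more than one, which makes the "exactly one row"
# expectation explicit.
value = only(matches)
```

Using only instead of [1] means a duplicated or missing name fails loudly instead of silently returning the first match.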

There is an open PR, https://github.com/JuliaData/DataFrames.jl/pull/1908, that once merged will allow you to do the following:

gd = groupby(df, :name) # this effectively creates an index on df by grouping it
gd[("A",)].val # get the group defined by "A" as a SubDataFrame and fetch the :val column from it
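As a self-contained sketch of the grouped lookup, assuming a DataFrames.jl version that already includes tuple indexing of a GroupedDataFrame (the feature proposed in the PR above) and the example table from the question; the name sub is only illustrative:

```julia
using DataFrames

# Rebuild the example table from the question (both columns are Strings).
df = DataFrame(name = ["A", "B", "C", "D", "E", "X"],
               val  = ["1", "2", "4", "3", "4", "5"])

# Group once by the key column; the GroupedDataFrame then acts as a lookup.
gd = groupby(df, :name)

# Index with a 1-tuple holding the key value to get that group
# as a SubDataFrame (a view into df).
sub = gd[("A",)]

# The group has one row here, so `only` extracts the scalar value.
only(sub.val)
```

The grouping is built once, so repeated lookups by name do not need to rescan the column the way a boolean mask does.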