Can indexes be added to DataFrame columns to improve selection performance?

In relational databases like MySQL we have the option to add “indexes” to columns in order to speed up selection (at the cost of slower edits to the data).

Is there an option to add indexes to Julia DataFrames? I found a very old topic about this, but I’m not sure what it led to (a Google search for IndexedVectors didn’t turn up anything).


You can use a GroupedDataFrame for indexing: group the DataFrame by the key column(s) once, then look up groups by key. The lookups will be very fast, but each one returns a sub-dataframe, even if all the keys are unique (in that case, indexing into a group always returns a one-row sub-dataframe rather than a single row).

This might be faster than just scanning the column with findfirst, depending on your use case. It’s worth benchmarking both.
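A minimal sketch of the two approaches, assuming a reasonably recent DataFrames.jl (one that allows indexing a GroupedDataFrame with a tuple of key values) and Julia 1.4 or later for `only`. The table and its column names `id` and `name` are made up for illustration:

```julia
using DataFrames

# Hypothetical example table with a unique key column `id`.
df = DataFrame(id = [10, 20, 30, 40], name = ["a", "b", "c", "d"])

# Build the grouping once; it plays the role of the index.
gdf = groupby(df, :id)

# Look up a group by its key value(s), passed as a tuple.
sub = gdf[(30,)]          # a one-row SubDataFrame
row = only(eachrow(sub))  # extract the single row as a DataFrameRow

# Linear-scan alternative: find the first matching position, then index.
i = findfirst(==(30), df.id)
row_scan = df[i, :]
```

The grouping is a snapshot of `df` at the time `groupby` was called, so it likely needs to be rebuilt after rows are added or the key column is modified. For a handful of lookups the cost of building it may outweigh the gain over `findfirst`.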


You can also use AcceleratedArrays.jl for some columns.
