Does indexing over a subset of rows in a DataFrame return a view or a copy?

According to the manual, [:, ColumnName] returns a read-only view of the columns. Does it also work when indexing over a subset of the rows [my_rows_of_interest, :]? Does it matter if my_rows_of_interest is contiguous or not?

A copy.

You can do

@view df[my_rows_of_interest, :]

for a view.
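To make the difference concrete, here is a minimal sketch. The table `df` and the row selection `rows` are made up for illustration; the selection is deliberately non-contiguous to show that contiguity does not matter for either form.

```julia
using DataFrames

df = DataFrame(id = 1:5, x = [10, 20, 30, 40, 50])
rows = [1, 3, 4]          # hypothetical, non-contiguous row selection

c = df[rows, :]           # plain indexing: a new DataFrame (copy)
v = @view df[rows, :]     # a SubDataFrame that shares memory with df

df.x[1] = 99              # mutate the parent table

c.x[1]                    # still 10: the copy is unaffected
v.x[1]                    # 99: the view reflects the change
```

The same two forms work when `rows` is a range such as `2:4`.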

Thanks! If I were to do

return @view cached_df |> @filter(_.date >= from && _.date <= to)

in a function (referring to a global df “cached_df”), would it still be a view that is returned and used by the caller?

No, that isn’t feasible. @view isn’t that smart.

There is a PR for this here. It would have a slightly different syntax.

Note that even after that PR you would only be able to get a view from DataFrames functions, not from those defined in Query.
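As an aside, recent DataFrames.jl releases accept a `view = true` keyword in `filter`. Whether that keyword is what came out of the PR mentioned above is an assumption on my part, as is the availability of the keyword in your installed version. The table and the dates below are made up for illustration.

```julia
using DataFrames, Dates

df = DataFrame(date = Date(2020, 1, 1):Day(1):Date(2020, 1, 10),
               value = 1:10)
from, to = Date(2020, 1, 3), Date(2020, 1, 5)

# DataFrames' own filter (not Query's @filter); view = true asks for a
# SubDataFrame instead of a freshly allocated DataFrame.
sub = filter(:date => d -> from <= d <= to, df; view = true)

sub isa SubDataFrame      # true
```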

Given that cached_df is sorted w.r.t. :date, if I were to find the row numbers (which form a contiguous interval, because of the sorting) for which from <= :date <= to is satisfied, and instead write

return @view cached_df[row_start:row_end, :]

would that return a view to the caller? This link (https://juliadata.github.io/DataFrames.jl/stable/lib/indexing/), and also your initial response, seem to imply that. I just want to be sure that “return” does not do anything that changes that fact.

Correct. You will get a SubDataFrame back from the function. return doesn’t do anything special.
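For completeness, a sketch of that approach. The function name `rows_between` and the sample data are hypothetical; the binary search with `searchsortedfirst` / `searchsortedlast` is just one way to find the bounds and is only valid because `:date` is sorted.

```julia
using DataFrames, Dates

# Hypothetical global table, sorted by :date
cached_df = DataFrame(date = Date(2020, 1, 1):Day(1):Date(2020, 1, 10),
                      value = 1:10)

function rows_between(from::Date, to::Date)
    # First row with date >= from, last row with date <= to
    row_start = searchsortedfirst(cached_df.date, from)
    row_end   = searchsortedlast(cached_df.date, to)
    return @view cached_df[row_start:row_end, :]
end

sub = rows_between(Date(2020, 1, 3), Date(2020, 1, 5))

sub isa SubDataFrame        # true
parent(sub) === cached_df   # true: no rows were copied
```

If no date falls in the interval, `row_start:row_end` is an empty range and the function returns an empty view.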