It depends what you want to do:
- If you want to perform a single operation then it probably does not matter;
- If you want to do millions of such operations then:
  - either use higher-level functions provided by DataFrames.jl like `select` or `combine` and they will be efficient;
  - if you want to use low-level operations, like loops, then:
    - if your data frame is not wide then convert it to a `NamedTuple` with `Tables.columntable` - this operation will be cheap and later all you do with it is type stable;
    - if your data frame is very wide but you do not need to process all columns then drop unneeded columns and do what I described in the point above;
    - if your data frame is very wide and you need all columns then you have a problem - this is the case when writing type-stable code is hard and you should rather consider using `combine` or `select` as they are optimized to efficiently handle such cases.
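
A minimal sketch of the low-level route for a narrow data frame. The column names `a` and `b` and the helper `weighted_sum` are hypothetical, chosen only for illustration, and the sketch assumes the Tables.jl package is available in your environment alongside DataFrames.jl:

```julia
using DataFrames, Tables

# Hypothetical example data; any narrow data frame would do.
df = DataFrame(a = 1:5, b = [0.5, 1.5, 2.5, 3.5, 4.5])

# Hypothetical helper: a plain loop over a NamedTuple of vectors.
# The type of `cols` encodes every column's element type, so the
# compiler can specialize this function on it.
function weighted_sum(cols)
    total = 0.0
    for i in eachindex(cols.a, cols.b)
        total += cols.a[i] * cols.b[i]
    end
    return total
end

# Keep only the columns the loop needs, then convert to a NamedTuple of vectors.
cols = Tables.columntable(select(df, :a, :b))

result = weighted_sum(cols)
```

The `select` call covers the "drop unneeded columns" case: only the columns you keep end up in the type of the resulting `NamedTuple`, which is what keeps the compilation cost down.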
In summary - being type stable is not a free lunch, as it heavily burdens the Julia compiler. DataFrames.jl was designed to be maximally flexible, but this means that it must be type unstable (otherwise you would not be able to, e.g., dynamically add columns to a data frame). At the same time, the functions provided by DataFrames.jl were optimized to automatically “enable” type stability of operations. Finally - as I have said - if your data is narrow then turning it into a type-stable `NamedTuple` is cheap.
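
To illustrate the trade-off, here is a small sketch with hypothetical column names `a`, `b`, `c` and `total`. It first adds a column at run time, which is the kind of flexibility that prevents the data frame's type from describing its columns, and then leaves the computation to `combine`, which passes the selected columns to the function as ordinary vectors:

```julia
using DataFrames

# Hypothetical example data.
df = DataFrame(a = 1:5, b = [0.5, 1.5, 2.5, 3.5, 4.5])

# A column can be added dynamically, so the type of `df` cannot
# encode the names or element types of its columns.
df.c = df.a .* df.b

# `combine` extracts columns `a` and `b` and calls the anonymous
# function on them as vectors; the result is a one-row data frame.
total = combine(df, [:a, :b] => ((a, b) -> sum(a .* b)) => :total)
```

Here the anonymous function only ever sees concretely typed vectors, so no conversion to a `NamedTuple` is needed on your side.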