Why is this dataframe filter slow?

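The row-wise filter is not type stable: the column types of a DataFrame are not known to the compiler, so on each iteration Julia has to look up the type of row[:a] at run time and dispatch to the right method. The type-stable version passes the column to the predicate directly: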

myrow = filter(:a => (x -> x == "a"), df)

or shorter,

myrow = filter(:a => ==("a"), df)
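
To see the difference for yourself, here is a minimal sketch comparing the two forms. The data is made up for illustration (only the column name :a comes from the post), and the row-wise predicate is my guess at what the slow version looked like. Timings will vary by machine and include compilation on the first call, so treat them as a rough indication only.

```julia
using DataFrames

# Hypothetical example data; only the column name :a comes from the post.
n = 100_000
df = DataFrame(a = rand(["a", "b", "c"], n), b = 1:n)

# Row-wise form (assumed shape of the slow version): the predicate gets a
# DataFrameRow, so the type of row[:a] is only known at run time.
slow = filter(row -> row[:a] == "a", df)

# Column-wise form: the predicate gets the values of column :a directly.
fast = filter(:a => ==("a"), df)

# Both forms select the same rows.
@assert isequal(slow, fast)

# Rough timing of a second call to each form (no output is promised here).
t_slow = @elapsed filter(row -> row[:a] == "a", df)
t_fast = @elapsed filter(:a => ==("a"), df)
println("row-wise: ", t_slow, " s, column-wise: ", t_fast, " s")
```

For more reliable numbers, a benchmarking package such as BenchmarkTools.jl is the usual choice.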

See the blog post "Why DataFrame is not type stable and when it matters" by Bogumił Kamiński for more on type stability with DataFrames.
