Using @filter from the Tidier package inside a function

Hi!
I recently discovered the Tidier package, but I am having trouble using it inside a function. For instance,

using Tidier  # provides the @filter macro
using DataFrames

# Example function that filters a DataFrame `df` for rows where the column `col`
# (passed as a Symbol) satisfies some condition (here, greater than a threshold)
function filter_by_threshold(df::DataFrame, col::Symbol, threshold)
    @filter(df, col > threshold)
end

# Sample data
df = DataFrame(a = 1:10, b = 11:20)

# Using the function to filter rows where column :a is greater than 5
filtered_df = filter_by_threshold(df, :a, 5)
println(filtered_df)

This gives the error:

ERROR: ArgumentError: column name :col not found in the data frame

How can I pass the column symbol to @filter?

Hey! See the Interpolation page of the TidierData.jl documentation. You can use either @eval @filter(df, $col > threshold) (currently recommended, though on its way out in the long term, per @kdpsingh below) or @filter(df, !!col > threshold).


I hadn’t heard about Interpolation.
Thanks!


Great answer here by @eteppo.

I’ll add that the reason TidierData works like this is that all variables in macros operate in “data frame scope.” So when you reference col, TidierData assumes by default that you are referring to a column named col.

If you want to refer to a value outside of the data frame, you have to use interpolation. TidierData was initially built with !! (bang-bang) as a lazy interpolation operator. Any variable you prefix with !! is assumed to refer to a value outside of the data frame. So in your example, @filter(df, !!col > threshold) should also work.
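
To see why a bare col is read as a name instead of a value, here is a minimal sketch in plain Julia with no TidierData involved. The macro `@captured` is hypothetical and only returns the expression it was handed; it is not how TidierData is implemented. It only illustrates that a macro receives names, not values, and that `$` under `@eval` splices a value in before the macro is expanded.

```julia
# Hypothetical macro for illustration only: it returns the expression it
# received, unevaluated, so we can inspect what the macro "sees".
macro captured(ex)
    return QuoteNode(ex)
end

col = :a
threshold = 5

# The macro receives the name `col`, not the value :a stored in it.
seen_plain = @captured(col > threshold)

# With @eval, `$col` and `$threshold` are replaced by their values
# before the macro is expanded.
seen_spliced = @eval @captured($col > $threshold)

println(seen_plain)
println(seen_spliced)
```

The first expression still contains the literal name col, which is what a data frame macro would then look up as a column. The second contains a and 5.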

However, there are some places where !! currently doesn’t work because the TidierData parsing engine is in need of a refresh. In the meantime, the @eval-based solution is the recommended approach if the !! approach doesn’t work (although I suspect it will work for your use case).
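
For reference, a minimal sketch of the @eval approach applied to the function from the question, assuming the TidierData and DataFrames packages are installed. The sketch relies on one assumption worth flagging: @eval evaluates its expression in global scope, so the function's local `df` and `threshold` are spliced in with `$` as well, not only `col`. It has not been checked against every TidierData version.

```julia
using TidierData
using DataFrames

# Sketch: every local variable is spliced in with `$`, because @eval runs the
# expression in global scope, where the function's locals are not visible.
function filter_by_threshold(df::DataFrame, col::Symbol, threshold)
    @eval @filter($df, $col > $threshold)
end

# Same sample data as in the question
df = DataFrame(a = 1:10, b = 11:20)

# Keep rows where column :a is greater than 5
filtered_df = filter_by_threshold(df, :a, 5)
println(filtered_df)
```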

The main downside to the @eval approach is that it introduces a slight compilation delay. Thus, in the long run, the goal will be to remove the need for @eval.
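
If that delay matters, one fallback is to skip the macro inside the function altogether. The sketch below swaps in plain DataFrames.jl `filter` with a `column => predicate` pair in place of TidierData. It is an alternative technique, not the TidierData approach, and the function name `filter_by_threshold_plain` is made up for illustration. It works because an ordinary function call sees local variables directly, so no interpolation is needed.

```julia
using DataFrames

# Macro-free alternative using DataFrames.jl directly (not TidierData):
# `col => predicate` applies the predicate to each value of column `col`.
function filter_by_threshold_plain(df::DataFrame, col::Symbol, threshold)
    return filter(col => x -> x > threshold, df)
end

# Same sample data as in the question
df = DataFrame(a = 1:10, b = 11:20)

# Keep rows where column :a is greater than 5
filtered_df = filter_by_threshold_plain(df, :a, 5)
println(filtered_df)
```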
