DataFrames: obtaining the subset of rows by a set of values

Unfortunately that solution doesn’t fly when there are many columns with heterogeneous types:

julia> df = DataFrame(rand(10000, 100));

julia> df.a = 'a'
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> @time _filter(x -> x.x1 > 0.5, df);
  0.809720 seconds (3.87 M allocations: 132.545 MiB, 12.98% gc time)

julia> @time filter(x -> x.x1 > 0.5, df);
  0.105625 seconds (2.52 M allocations: 67.293 MiB, 10.89% gc time)

My current thinking is that the ideal interface would be something like filter(x1 -> x1 > 0.5, df), where we would extract the names of the function's arguments to identify which variables (here x1) are actually used. Only those columns would be passed to the function, which would avoid the problems caused by a very large number of columns while still offering a compact syntax.
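
For comparison, here is a minimal sketch of handing only the needed column to the predicate. It does not do the argument-name extraction described above. Instead it assumes a DataFrames.jl version that accepts the `column => function` form of `filter` and the `:auto` argument when constructing from a matrix. Both are assumptions relative to the version used for the timings above, and the names `res` and `res_mask` are just for illustration.

```julia
using DataFrames

# 100 Float64 columns named x1..x100 (assumes the :auto constructor form is available)
df = DataFrame(rand(10_000, 100), :auto)

# One extra column of a different type, so the rows are heterogeneous
df.a = fill('a', nrow(df))

# Only column x1 is passed to the predicate, so the other 100 columns
# never become part of the argument's type
res = filter(:x1 => x -> x > 0.5, df)

# The same selection written with a boolean mask over the column
res_mask = df[df.x1 .> 0.5, :]
```

The point of both forms is that the predicate sees a plain Float64 rather than a 101-field row, so the width of the table should not matter to the function being called.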