Boolean Indexing in dataframes

@bkamins How about sampling the dataframe by boolean values? I mean: take every column, but only for the rows where the boolean is true.

Like this example:

julia> df = DataFrame(rand(3,3), :auto)
3×3 DataFrame
 Row │ x1         x2        x3       
     │ Float64    Float64   Float64  
─────┼───────────────────────────────
   1 │ 0.045519   0.468771  0.387336
   2 │ 0.0133922  0.383619  0.418809
   3 │ 0.870746   0.898979  0.628106

For example, I want to get every row of the dataframe where x1 is larger than 0.02.

I’m not sure I follow the original question, but for the example you can do

df[df.x1 .> 0.02, :]

to select the subset of rows satisfying the desired criterion.
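
To make the mechanics explicit, here is a minimal sketch with the values from the example above hard-coded so the result is reproducible. The variable names `mask`, `kept`, `kept_x1_x3` and `v` are just illustrative.

```julia
using DataFrames

# Values copied from the example table so the outcome is deterministic
df = DataFrame(x1 = [0.045519, 0.0133922, 0.870746],
               x2 = [0.468771, 0.383619, 0.898979],
               x3 = [0.387336, 0.418809, 0.628106])

# Broadcasting the comparison gives one Bool per row
mask = df.x1 .> 0.02

# Rows where the mask is true, all columns (`:`); this allocates a new DataFrame
kept = df[mask, :]

# Same rows, but only some of the columns
kept_x1_x3 = df[mask, [:x1, :x3]]

# A non-copying view of the same rows
v = view(df, mask, :)
```

Here row 2 is dropped because its x1 value (0.0133922) is below the threshold, so `kept` has two rows.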

Alternatively, you can use the subset function:

subset(df, :x1 => ByRow(>(0.02)))
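
One place where the two approaches differ in practice is missing values, which the original example does not contain. As a sketch, assuming a hypothetical table `dfm` with a missing entry in x1: `subset` accepts a `skipmissing=true` keyword that drops rows whose condition is missing, while with plain indexing you have to turn missing into false yourself, for example with `coalesce`.

```julia
using DataFrames

# Hypothetical data with a missing value, not part of the original question
dfm = DataFrame(x1 = [0.045519, missing, 0.870746], x2 = [1, 2, 3])

# skipmissing=true treats a missing condition like false
a = subset(dfm, :x1 => ByRow(>(0.02)); skipmissing=true)

# With indexing, replace missing entries of the mask by false first
b = dfm[coalesce.(dfm.x1 .> 0.02, false), :]
```

Without `skipmissing=true` (or without the `coalesce`), both calls throw an error on this data instead of silently dropping the row.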

In DataFramesMeta.jl, this is simply

@rsubset df :x1 > .02
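
For comparison, a small sketch of the row-wise and column-wise macros in DataFramesMeta.jl, again with the x1 values from the example hard-coded. `@rsubset` applies the expression to each row, so a plain `>` works; `@subset` operates on whole columns, so the comparison has to be broadcast.

```julia
using DataFrames, DataFramesMeta

df = DataFrame(x1 = [0.045519, 0.0133922, 0.870746])

# Row-wise: the expression sees one value of :x1 at a time
r1 = @rsubset(df, :x1 > 0.02)

# Column-wise: the expression sees the whole :x1 vector, hence `.>`
r2 = @subset(df, :x1 .> 0.02)
```

Both calls keep the same two rows.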

I think maybe you want this:

using Tidier, DataFrames
df = DataFrame(a = [1, 2, missing, 4, 5])

@filter

@chain df begin
   @filter(a >= 2)
end

Conditionals

@chain df begin
  @mutate(a = if_else(a >= 2, true, false))
end