Learning to benchmark and find the best function to select a subset of a dataframe

Yes, I can answer it:

julia> using BenchmarkTools

julia> x = rand(10^6);

julia> bx = [v < 0.5 for v in x];

julia> ix = findall(!, bx);

julia> @benchmark z[bx] setup=(z=copy(x)) evals=1
BenchmarkTools.Trial: 745 samples with 1 evaluation.
 Range (min … max):  4.486 ms … 28.198 ms  ┊ GC (min … max): 0.00% … 81.56%
 Time  (median):     4.569 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.939 ms ±  2.251 ms  ┊ GC (mean ± σ):  4.37% ±  8.00%

  █▇▄▄▃▂▁▂▁▁
  ██████████▇▇▇█▇▇▆▅▆▆▁▅▅▅▆▇▆▆▄▇▅▅▇▆▄▁▄▁▁▁▄▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▄ ▇
  4.49 ms      Histogram: log(frequency) by time     7.24 ms <

 Memory estimate: 3.82 MiB, allocs estimate: 2.

julia> @benchmark z[ix] setup=(z=copy(x)) evals=1
BenchmarkTools.Trial: 1398 samples with 1 evaluation.
 Range (min … max):  1.287 ms … 29.172 ms  ┊ GC (min … max):  0.00% … 94.65%
 Time  (median):     1.355 ms              ┊ GC (median):     0.00%
 Time  (mean ± σ):   1.699 ms ±  2.259 ms  ┊ GC (mean ± σ):  12.15% ±  8.70%

  ██▅▄▂▁▁                    ▁
  ███████▇▇█▇▆▄▇▅▄▄▄▄▄▁▄▄▄▄▄▆█▇▇▇▆▄▄▄▄▅▆▅▄▄▁▁▄▄▄▁▁▄▁▁▁▁▁▁▁▁▄ █
  1.29 ms      Histogram: log(frequency) by time     4.27 ms <

 Memory estimate: 3.81 MiB, allocs estimate: 2.

julia> @benchmark deleteat!(z, bx) setup=(z=copy(x)) evals=1
BenchmarkTools.Trial: 1764 samples with 1 evaluation.
 Range (min … max):  798.600 μs …   3.617 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     825.700 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   927.020 μs ± 304.150 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ██▄▂▂▂▁ ▁
  ██████████▇▇▇▆▅▅▅▇▅▃▃▅▅▇▁▄▅▃▁▄▁▃▁▃▁▁▁▁▄▃▁▁▁▁▁▃▃▁▄██▇▇▇▆▁▄▁▄▃▅ █
  799 μs        Histogram: log(frequency) by time       2.22 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark deleteat!(z, ix) setup=(z=copy(x)) evals=1
BenchmarkTools.Trial: 859 samples with 1 evaluation.
 Range (min … max):  3.794 ms …   6.477 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.828 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.976 ms ± 316.706 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▆▂▂▂▁▃▁  ▁
  ████████████▇██▇▆▅▆▇▅▅▇▆▇▄▅▆▅▆▅▆▆▅▁▅▄▄▅▅▄▅▅▄▄▆▅▁▅▄▁▅▅▅▅▁▁▁▅ ▇
  3.79 ms      Histogram: log(frequency) by time      5.21 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

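The current code in DataFrames.jl assumes that deleteat! and getindex respond to the selector type in the same way performance-wise. They clearly do not: getindex is faster with integer indices (median about 1.36 ms vs. 4.57 ms with a Boolean mask), while deleteat! is faster with a Boolean mask (median about 0.83 ms vs. 3.83 ms with integer indices). I will make a PR to fix it.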
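For readers wondering why the two operations are comparable at all: since ix = findall(!, bx) holds the positions where the mask is false, indexing with ix keeps the same elements that deleteat! with bx leaves behind, and vice versa. A minimal sketch checking that equivalence on a smaller vector follows. It is only an illustration of the setup used in the benchmarks, not the DataFrames.jl code, and the variable names kept_by_getindex and kept_by_deleteat are made up for this example.

```julia
x = rand(10^4)                      # smaller than in the benchmark, for a quick check
bx = [v < 0.5 for v in x]           # Boolean mask, built as in the benchmark
ix = findall(!, bx)                 # sorted integer indices where the mask is false

# Keeping the elements at ix gives the same result as deleting those flagged by bx.
kept_by_getindex = x[ix]
kept_by_deleteat = deleteat!(copy(x), bx)   # copy, because deleteat! mutates its argument
@assert kept_by_getindex == kept_by_deleteat

# The complementary pair agrees as well.
@assert x[bx] == deleteat!(copy(x), ix)
```

So the same subset can be produced through either operation, which is why it matters which selector type is fast for which function. Note that deleteat! works in place, so it is only a drop-in alternative when the original vector is no longer needed or a copy is made first.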