Byrow with user defined function

for data like

using InMemoryDatasets

ds = Dataset(x1=[1,2,3], x2=[3,2,1])

if I want to keep only the rows where x1 is less than x2, then this works

filter(ds, :x1, type=isless, with = :x2)

but when I define a function for byrow like

f(x)=x[1]< x[2]

byrow(ds, f, :) # works

I cannot use it with filter

filter(ds, :, type=f)
ERROR: BoundsError: attempt to access 3×2 Dataset at index [Union{Missing, Bool}[true, false, false], :]
Stacktrace:

after reading that
" … filter(ds, cols; [view = false, type = all,...]) is the shortcut for ds[byrow(ds, type, cols; ...), :] …"

julia> byrow(ds, f, :)# works
3-element Vector{Union{Missing, Bool}}:
  true
 false
 false

I tried the following two expressions, which I would expect to give the same result.
They do not, as rightly observed by @monopolynomial

julia> ds[[true,false,false],:]
1×2 Dataset
 Row │ x1        x2       
     │ identity  identity
     │ Int64?    Int64?
─────┼────────────────────
   1 │        1         3

julia> ds[byrow(ds, f, :),:]
ERROR: BoundsError: attempt to access 3×2 Dataset at index [Union{Missing, Bool}[true, false, false], :]

" Naturally, other funs supported by byrow which return a Vector{Bool} or BitVector can be used to filter observations, too."

the problem therefore seems to be that row indexing, ds[byrow(...), :], only works with a Vector{Bool} (or BitVector) mask, not with the Vector{Union{Missing, Bool}} that byrow returns here
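To make the type difference concrete, here is a small check in plain Julia (no Dataset involved). The vector `mask` is a stand-in I wrote by hand for the byrow result shown above. That the package's row indexing dispatches on AbstractVector{Bool} is only my assumption, it is not confirmed anywhere in this thread.

```julia
# Hand-written stand-in for the result of byrow(ds, f, :) shown above
mask = Union{Missing, Bool}[true, false, false]

# Type parameters are invariant in Julia, so a vector whose element type is
# Union{Missing, Bool} is not an AbstractVector{Bool}, even without any missing
is_bool_mask = mask isa AbstractVector{Bool}

# Converting every element gives a container whose element type is Bool
converted = Bool.(mask)
converted_is_bool_mask = converted isa AbstractVector{Bool}

println(is_bool_mask, " ", converted_is_bool_mask)
```

If that assumption holds, it would explain why the same three values index the Dataset fine once they are wrapped in Bool.(...).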

this “works”

julia> ds[Bool.(byrow(ds, f, :)),:]
1×2 Dataset
 Row │ x1        x2       
     │ identity  identity 
     │ Int64?    Int64?   
─────┼────────────────────
   1 │        1         3

the issue seems to be the handling of Union{Missing, T} types

julia> ds[Bool.(byrow(ds, f, :)),:]

This doesn’t work if f returns missing.

Yes.
I noticed that and had partly anticipated it, but I didn’t want to dwell on it.
Perhaps, in the general case, a function would be needed to establish on a case-by-case basis whether missing is to be evaluated as true or false (or simply remain missing)…
Maybe a kwarg that can be set by the user as needed could do the trick.
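As a rough sketch of that keyword idea, in plain Julia: `resolve_mask` and its `missing_as` keyword are hypothetical names of mine and not part of InMemoryDatasets, and `mask` is a hand-written stand-in for a byrow result where f returned missing on the last row.

```julia
# Hypothetical helper (not part of InMemoryDatasets): decide how missing
# entries of a row mask are treated before the mask is used for indexing.
function resolve_mask(mask::AbstractVector; missing_as::Union{Missing, Bool} = false)
    # coalesce returns its first non-missing argument, so every missing entry
    # becomes `missing_as`; with missing_as = missing the values are unchanged.
    return coalesce.(mask, missing_as)
end

# Stand-in for a byrow result in which f returned missing for the third row
mask = Union{Missing, Bool}[true, false, missing]

drop_missing = resolve_mask(mask)                        # missing counts as false
keep_missing = resolve_mask(mask; missing_as = true)     # missing counts as true
unchanged    = resolve_mask(mask; missing_as = missing)  # missing stays missing
```

With `missing_as` set to true or false, the result has element type Bool, so I would expect something like ds[resolve_mask(byrow(ds, f, :)), :] to behave like the Bool.(...) version above while also surviving a missing. I have not run that against a Dataset.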

findall is good for this.
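A minimal sketch of what that could look like, in plain Julia. `mask` is a hand-written stand-in for a byrow result, and using isequal(true) as the predicate is my own choice so that a missing entry is skipped rather than tested as a condition.

```julia
# Stand-in for a byrow result; the third row evaluated to missing
mask = Union{Missing, Bool}[true, false, missing]

# isequal(true) is false for both false and missing, so only rows whose
# mask value is exactly true are returned, as integer row indices
rows = findall(isequal(true), mask)

# Assumed usage with the Dataset from this thread (not run here):
# ds[rows, :]
```

The result is a vector of integer row numbers, so the Vector{Bool} versus Vector{Union{Missing, Bool}} question no longer comes up at the indexing step.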
