Queryverse: How to @select columns with specific eltype? (using `oftype` function)

Dear all Queryverse users.

In the auxiliary package QueryOperators.NamedTupleUtilities there is a helper function named oftype. I guess it is meant for selecting columns with a specific eltype, but I cannot figure out how to use it.

The very similar function startswith() can be used easily in the following way:

df = DataFrame(foo=[1,2,3], bar=[3.0,2.0,1.0], bat=["a","b","c"])

df |> @select(startswith("b")) |> DataFrame

It produces the expected result.

3×2 DataFrame
│ Row │ bar     │ bat    │
│     │ Float64 │ String │
├─────┼─────────┼────────┤
│ 1   │ 3.0     │ a      │
│ 2   │ 2.0     │ b      │
│ 3   │ 1.0     │ c      │

However, the analogous call with oftype (shown after the signatures below) does not work. This is surprising, since the signatures of the two functions are so similar.

@generated function startswith(a::NamedTuple{an}, ::Val{bn}) where {an, bn}
@generated function oftype    (a::NamedTuple{an}, ::Val{b}) where {an, b}

df |> @select(oftype(Float64)) |> DataFrame

ERROR: ArgumentError: 'QueryOperators.EnumerableMap{NamedTuple{(),Tuple{}},QueryOperators.EnumerableIterable{NamedTuple{(:foo, :bar, :bat),Tuple{Int64,Float64,String}},Tables.DataValueRowIterator{NamedTuple{(:foo, :bar, :bat),Tuple{Int64,Float64,String}},Tables.RowIterator{NamedTuple{(:foo, :bar, :bat),Tuple{Array{Int64,1},Array{Float64,1},Array{String,1}}}}}},getfield(Main, Symbol("##158#160"))}' iterates 'NamedTuple{(),Tuple{}}' values, which don't satisfy the Tables.jl Row-iterator interface

There are other functions in this package (QueryOperators.NamedTupleUtilities). They seem to provide functionality similar to tidyverse’s select_if function.

Can anyone help me understand how to use these functions?

This is broken right now, but I just finished some PRs that should fix this. Will probably take a bit until they are reviewed and we’ve tagged a new release.

Once they are merged, the syntax to use this is @select(::Number, ::String) etc.

Ok, I just released a version of Query.jl where this now works. Try

df |> @select(::Float64) |> DataFrame

It also works with abstract super types, for example to select all numeric columns (int and float):

df |> @select(::Number) |> DataFrame
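To make the subtype matching concrete, here is a minimal sketch in plain Julia (no packages). It is only an illustration: select_by_eltype is a hypothetical helper written for this post, not part of Query.jl, and it is not meant to reflect how @select(::T) is implemented internally. It assumes a column is kept when its eltype is a subtype of the requested type.

```julia
# Hypothetical helper for illustration only; not part of Query.jl.
# Keeps the columns of a NamedTuple of vectors whose element type is a subtype of T.
function select_by_eltype(cols::NamedTuple, ::Type{T}) where {T}
    # Collect the names of the columns whose eltype matches T (concrete or abstract).
    kept = Tuple(name for name in keys(cols) if eltype(cols[name]) <: T)
    # Build a new NamedTuple containing only those columns, in their original order.
    return NamedTuple{kept}(cols)
end

# Same columns as the DataFrame in the question, stored as a plain NamedTuple.
cols = (foo = [1, 2, 3], bar = [3.0, 2.0, 1.0], bat = ["a", "b", "c"])

floats  = select_by_eltype(cols, Float64)  # concrete type: matches only :bar
numbers = select_by_eltype(cols, Number)   # abstract supertype: matches :foo and :bar
strings = select_by_eltype(cols, String)   # matches only :bat

println(keys(floats))
println(keys(numbers))
println(keys(strings))
```

The point of the sketch is that the check is `<:` rather than `==`, which is why an abstract type such as Number picks up both the integer and the floating-point column.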

Thanks for the prompt replies and nice solutions…
