We can allow `?` as a general-purpose postfix operator; it doesn't have to be only for missings. Then Missings.jl would just be one package making use of it.
It could look like this:
- Allow `?` as a postfix operator like `'` (but without any method definition in Base, initially at least)
- In Missings.jl, define `?` as a shorthand for `skipmissing` (this could also go in a new ShortMissings.jl package or whatever)
- Fix `mean` to work well with `skipmissing` (cf. this comment)
- Add overloads to `cor`, etc. to properly support things like `cor(skipmissing(a), skipmissing(b))`
- Add support for skipmissing-boolean indexing in DataFrames.jl
At this point we can do things like this:
mean(x?)
cor(x?, v?) # same behavior as polars or pandas
df[(df.x .> 0)?, :]
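For comparison, here is roughly what the first and third lines correspond to in today's Julia. This is only a sketch: reading `(mask)?` as "treat missing as false" is my assumption about the intended indexing semantics, and a plain vector `y` stands in for the DataFrame column. The `cor` line is left out because it depends on the overloads listed above.

```julia
using Statistics

x = [1, 2, missing, 3]

# mean(x?) would correspond to this call, which already works today
m = mean(skipmissing(x))

# (y .> 0)? used as an index: assumed here to mean "treat missing as false"
y = [1, -2, missing, 3]
mask = coalesce.(y .> 0, false)   # Bool mask with no missing entries
kept = y[mask]                    # keeps only the entries known to be positive
```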
But let’s go further:
- In Missings.jl, define `?(f::Function)` to make wrappers that skip missing values: `cor?(x,v)` would mean `MissingSkipper(cor)(x,v)`, which would eventually call `cor(x?, v?)`
It’s a nice shortcut but especially useful for cases like this:
combine(gdf, :value => mean?)
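For a one-argument reducer, the proposed `mean?` would behave like the composition `mean ∘ skipmissing`, which can already be written today. A small sketch in plain Julia, where the `groups` dictionary is a made-up stand-in for the grouped data frame:

```julia
using Statistics

# What `mean?` would amount to for a single argument
skipmean = mean ∘ skipmissing

# Hypothetical stand-in for a grouped column: group key => values
groups = Dict("a" => [1, missing, 3], "b" => [missing, 10])

# Apply the missing-skipping reducer to each group
result = Dict(k => skipmean(v) for (k, v) in groups)
```

The postfix form mostly saves typing here, but it would extend to functions of several arguments, where plain composition does not.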
Here’s a working prototype with `'ˢ` instead of `?`:
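using Statistics  # provides mean and cor
struct MissingSkipper{T}
    f::T
end
# x'ˢ on an array skips its missing values
var"'ˢ"(x::AbstractArray) = skipmissing(x)
# f'ˢ on a function wraps it so that each positional argument goes through skipmissing
var"'ˢ"(f::Base.Callable) = MissingSkipper(f)
(s::MissingSkipper)(args...; kwargs...) =
    (s.f)(skipmissing.(args)...; kwargs...)
x = [1, 2, missing, 3]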
julia> mean(x'ˢ)
2.0
julia> mean'ˢ(x)
2.0
# would work if cor(x::SkipMissing, y::SkipMissing) was defined:
julia> cor'ˢ(x, x)
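
For that last call to work, a two-argument `cor` method for `SkipMissing` inputs would be needed. Here is a minimal sketch of what it might look like, under two assumptions of mine: the intended semantics is pairwise deletion (a position is dropped when either input is missing there), and it is acceptable to read the internal field `x` of `Base.SkipMissing`. It also adds a method to a function this code doesn't own, so a real version would belong in Statistics.jl itself.

```julia
using Statistics

# SKETCH ONLY: relies on the internal field `x` of Base.SkipMissing and
# extends Statistics.cor from outside the package that owns it.
function Statistics.cor(a::Base.SkipMissing, b::Base.SkipMissing)
    xa, xb = a.x, b.x
    length(xa) == length(xb) ||
        throw(DimensionMismatch("inputs must have the same length"))
    # Keep only the positions where both inputs are present (pairwise deletion)
    keep = [!ismissing(u) && !ismissing(w) for (u, w) in zip(xa, xb)]
    # identity.(...) narrows the element type now that no missing remains
    return cor(identity.(xa[keep]), identity.(xb[keep]))
end

a = [1, 2, missing, 4, 5]
b = [2, 4, 6, missing, 11]
r = cor(skipmissing(a), skipmissing(b))   # uses positions 1, 2 and 5 only
```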