Boxplot with NaN / missing / Inf in data?

Hello everyone,

I’m struggling at the moment with boxplots. My data has many data points, and a lot of them are NaN, Inf, or missing. I was wondering if there is a way to just ignore these values without having to edit the data itself?

A simple example looks like this:

boxplot([1, 2, 3, NaN, 20, 1, 2, 7, 3])


ERROR: ArgumentError: quantiles are undefined in presence of NaNs or missing values
  [1] _quantilesort!(v::Vector{Float64}, sorted::Bool, minp::Float64, maxp::Float64)
    @ Statistics ~/.julia/juliaup/
  [2] #quantile!#49
    @ ~/.julia/juliaup/ [inlined]
  [3] quantile!
    @ ~/.julia/juliaup/ [inlined]
  [4] quantile(v::Vector{…}, p::StepRangeLen{…}; sorted::Bool, alpha::Float64, beta::Float64)
    @ Statistics ~/.julia/juliaup/
  [5] macro expansion
    @ ~/.julia/packages/StatsPlots/cStOe/src/boxplot.jl:41 [inlined]
  [6] apply_recipe(plotattributes::AbstractDict{Symbol, Any}, ::Type{Val{:boxplot}}, x::Any, y::Any, z::Any)
    @ StatsPlots ~/.julia/packages/RecipesBase/BRe07/src/RecipesBase.jl:300
  [7] _process_seriesrecipe(plt::Any, plotattributes::Any)
    @ RecipesPipeline ~/.julia/packages/RecipesPipeline/BGM3l/src/series_recipe.jl:50
  [8] _process_seriesrecipes!(plt::Any, kw_list::Any)
    @ RecipesPipeline ~/.julia/packages/RecipesPipeline/BGM3l/src/series_recipe.jl:27
  [9] recipe_pipeline!(plt::Any, plotattributes::Any, args::Any)
    @ RecipesPipeline ~/.julia/packages/RecipesPipeline/BGM3l/src/RecipesPipeline.jl:99
 [10] _plot!(plt::Plots.Plot, plotattributes::Any, args::Any)
    @ Plots ~/.julia/packages/Plots/a3u1v/src/plot.jl:223
 [11] plot(args::Any; kw...)
    @ Plots ~/.julia/packages/Plots/a3u1v/src/plot.jl:102
 [12] boxplot(args::Any; kw...)
    @ Plots ~/.julia/packages/RecipesBase/BRe07/src/RecipesBase.jl:427
 [13] top-level scope
    @ Untitled-1:282
Some type information was truncated. Use `show(err)` to see complete types.

The easiest thing would be to filter the data, either with an iterator or by making a new list. For missing values, skipmissing exists for exactly that.

skipmissing([1, 2, missing, 3]) # returns an iterator

From the iterator you can create a list using collect
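As a minimal sketch of that step (the variable names `data`, `itr`, and `clean` are only for illustration), collecting the skipmissing iterator could look like this:

```julia
# Data with one missing entry; the element type is Union{Missing, Int}.
data = [1, 2, missing, 3]

# skipmissing is lazy: it wraps the vector without copying it.
itr = skipmissing(data)

# collect materializes the iterator into a plain vector without missing.
clean = collect(itr)
```

The collected vector no longer allows missing in its element type, so it can be passed to functions that do not accept missing values.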


That only handles missing values.
If you want to filter out anything else (NaN, Inf), you need to write your own filter.

julia> filter(x -> !isnan(x), [1, 3, NaN])
2-element Vector{Float64}:
 1.0
 3.0
Prefix the call with the Iterators module to get a lazy iterator instead.

julia> Iterators.filter(x -> !isnan(x), [1, 3, NaN])
Base.Iterators.Filter{var"#7#8", Vector{Float64}}(var"#7#8"(), [1.0, 3.0, NaN])
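Since the original question mentions NaN, Inf, and missing together, here is a rough sketch of one filter that drops all three. The helper name `keep` and the sample data are made up for illustration; `ismissing` and `isfinite` are from Base.

```julia
# Hypothetical helper: keep only entries that are present and finite.
# ismissing is checked first so that isfinite is never called on missing
# (isfinite(missing) returns missing, which filter cannot use as a Bool).
keep(x) = !ismissing(x) && isfinite(x)

# Sample data containing NaN, Inf, -Inf, and missing.
data = [1, 2, 3, NaN, 20, Inf, missing, 7, -Inf]

# filter keeps the original element type (Union{Missing, Float64}),
# so broadcast Float64 over the result to get a plain Vector{Float64}.
cleaned = Float64.(filter(keep, data))
```

With StatsPlots loaded, `boxplot(cleaned)` should then work the same way as a boxplot call on any plain vector of numbers. The original `data` is left untouched.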

You might also be interested in Skipper.jl.
It sort of generalizes skipmissing to custom predicates.