Rationale for dropmissing vs skipmissing

DoktorMike · August 24, 2024, 1:34pm

As outlined in "How to remove rows containing missing from DataFrame?", the way to get rid of missing values in a DataFrame is to use dropmissing. This works really well and I use it all the time, but it bugs me that I cannot use what would feel more natural to me, i.e., skipmissing. Is there a logical reason I’m overlooking for why dropmissing had to be introduced instead of a specialized skipmissing method for a DataFrame?

bkamins · August 24, 2024, 3:18pm

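skipmissing has a different indexing rule: it is a lazy wrapper that keeps the indices of the original collection, so indexing a position that holds missing throws an error instead of returning the next non-missing value: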

julia> x = [1, 2, missing, 4, 5]
5-element Vector{Union{Missing, Int64}}:
 1
 2
  missing
 4
 5

julia> y = skipmissing(x)
skipmissing(Union{Missing, Int64}[1, 2, missing, 4, 5])

julia> y[2]
2

julia> y[3]
ERROR: MissingException: the value at index (3,) is missing

julia> y[4]
4
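
To make the contrast concrete, here is a minimal sketch using only Base Julia (it is not part of the original reply, and the names total, valid, threw and z are purely illustrative). It shows that the skipmissing wrapper keeps the parent's indices with a gap, while materializing it with collect renumbers the surviving elements. That renumbering is the analogue, for a plain vector, of what one expects from a row-dropping operation like dropmissing.

```julia
x = [1, 2, missing, 4, 5]
y = skipmissing(x)            # lazy wrapper around x; no data is copied

# Iteration and reductions simply skip the missing entry.
total = sum(y)

# The valid indices are those of the parent vector, so there is a gap at 3.
valid = collect(keys(y))

# Indexing a missing position throws rather than shifting later elements down.
threw = try
    y[3]
    false
catch err
    err isa MissingException
end

# Materializing the wrapper renumbers the survivors: position 3 now holds 4.
z = collect(y)
```

Here total is 12, valid is [1, 2, 4, 5], threw is true, and z is a Vector{Int64} equal to [1, 2, 4, 5]. Since y[4] == 4 but z[3] == 4, the two objects follow different indexing contracts, which is presumably why a separate, eagerly evaluated function was preferred for data frames.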

DoktorMike · August 24, 2024, 6:52pm

Thanks for the explanation.
