Why are missing values not ignored by default?

pdeffebach · November 27, 2023, 5:52pm

It’s the data analyst’s job to ensure data integrity. The question “which observations contribute to this statistic” is something Julia, the language, can’t answer. The analyst should absolutely conduct additional robustness checks about how missing values are handled and what’s the appropriate way to deal with them.

The question is whether imposing skipmissing(...) everywhere, or propagating missing through Boolean operations, is the right way to go about that. It’s costly for users to write skipmissing every time they want to calculate a mean. I’m simply arguing that the cost isn’t always worth the benefit.
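For concreteness, here is a minimal sketch of the behavior being discussed, using only Base and the Statistics standard library. The vector `x` is made-up data, and `skipmean` is a hypothetical convenience helper I introduce only to illustrate the ergonomic cost; it is not part of any package.

```julia
using Statistics

# Made-up example data with one missing observation.
x = [1.0, 2.0, missing, 4.0]

# By default, `missing` propagates through the reduction.
m_default = mean(x)

# The analyst has to opt in to dropping missing values.
m_skip = mean(skipmissing(x))

# Hypothetical helper (not from any package) that hides the opt-in.
skipmean(v) = mean(skipmissing(v))

# Boolean operations follow three-valued logic.
t1 = true | missing    # known to be true regardless of the missing operand
t2 = false & missing   # known to be false regardless of the missing operand
t3 = true & missing    # cannot be decided, so the result is missing

# Comparisons propagate too; `coalesce` picks an explicit fallback.
flags = coalesce.(x .> 1.5, false)
```

The trade-off is visible here: `mean(x)` forces a decision about the missing observation, while something like `skipmean` is shorter to type but makes that decision silently.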
