From what I read here, when you have an x of type Vector{Union{Missing,Float64}}, a second array of the same size is created, holding whether the value at position x[i] is missing or not. Using that approach implies storing that information somewhere, with the corresponding cost.
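As a rough sketch of what that looks like in practice (assuming a recent Julia 1.x; the names x and y are just for illustration), you can compare a vector that allows missing values with a plain Float64 vector. Base.summarysize only gives an approximate footprint and the exact byte counts depend on the Julia version, so I am not quoting numbers here:

```julia
# Vector that may hold missing values, next to a plain Float64 vector.
x = Union{Missing,Float64}[1.0, missing, 3.0]
y = [1.0, 2.0, 3.0]

# The element type records that each slot is either Missing or Float64.
println(eltype(x))

# Per-element check of which slots are missing.
println(ismissing.(x))

# Approximate memory footprint of each vector (version dependent).
println(Base.summarysize(x), " vs ", Base.summarysize(y))

# skipmissing ignores the missing slots when reducing.
println(sum(skipmissing(x)))
```

Running this on your own machine should show whether the extra bookkeeping matters for the sizes you care about.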
The answer to your question is mentioned explicitly:
One of Julia’s strengths is that user-defined types are as powerful and fast as built-in types. To fully take advantage of this, missing values had to support not only standard types like Int, Float64 and String, but also any custom type. For this reason, Julia cannot use the so-called sentinel approach like R and Pandas to represent missingness, that is reserving special values within a type’s domain. For example, R represents missing values in integer and boolean vectors using the smallest representable 32-bit integer (-2,147,483,648), and missing values in floating point vectors using a specific NaN payload (1954, which rumour says refers to Ross Ihaka’s year of birth). Pandas only supports missing values in floating point vectors, and conflates them with NaN values.
In order to provide a consistent representation of missing values which can be combined with any type, Julia 0.7 will use missing, an object with no fields which is the only instance of the Missing singleton type. This is a normal Julia type for which a series of useful methods are implemented. Values which can be either of type T or missing can simply be declared as Union{Missing,T}. For example, a vector holding either integers or missing values is of type Array{Union{Missing,Int},1}: