Adding two Array{Union{Missing,Float64}} returns Array{Float64}, is this by design?

viraltux · October 9, 2021, 11:20am

y = Array{Union{Missing,Float64}}(undef,100);
y[:] = rand(100)
ϵ = Array{Union{Missing,Float64}}(undef,100);
ϵ[:] = rand(100)

julia> eltype(y.+ϵ)
Float64

I am not sure but I imagine the idea behind doing this is returning the type that simplifies the array and will perform best after an operation.

Whatever the reason, I wonder what would be the best way to keep the type Union{Missing,T} when operating on Arrays, regardless of whether the Array contains missing values or not.

rafael.guerra · October 9, 2021, 11:46am

Isn’t that what happens if the assignments are made using dots?

y .= rand(100)

viraltux · October 9, 2021, 11:59am

Oh! that’s nice!

rafael.guerra · October 9, 2021, 12:00pm

The types do not seem to be preserved after the sum (Julia 1.6), though.

viraltux · October 9, 2021, 12:01pm

Yeah, that’s a bit of a problem, I was wondering what was the Julia way to preserve the type with missing values.

rafael.guerra · October 9, 2021, 12:04pm

Not sure if this is recommended:
typeof(y)(y+ϵ)
or:
Union{typeof(y),typeof(ϵ)}(y+ϵ)

viraltux · October 9, 2021, 12:05pm

In the package Missings.jl they use convert, but I was wondering if there was a better way.

Sukera · October 9, 2021, 12:07pm

It’s because your arrays don’t actually contain any missing values. If you do y[end] = missing, the return type is Vector{Union{Missing, Float64}}:

julia> y[end] = missing
missing

julia> typeof(y .+ ϵ)
Vector{Union{Missing, Float64}}

I haven’t checked what happens in a function, but I suspect this is a global-scope only thing.
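A quick way to check the function-scope question is to wrap the broadcast in a function and compare element types before and after a missing is inserted. This is only a sketch: `add_vectors` is a hypothetical helper name, and the arrays are built the same way as in the original post.

```julia
# Two vectors whose element type allows missing but which hold only Float64 values.
y = Vector{Union{Missing,Float64}}(undef, 100)
y .= rand(100)
ϵ = Vector{Union{Missing,Float64}}(undef, 100)
ϵ .= rand(100)

# Hypothetical helper: the same broadcast, but executed inside a function body.
add_vectors(a, b) = a .+ b

# No missing present: the freshly allocated result is narrowed to Float64.
eltype_without_missing = eltype(add_vectors(y, ϵ))

# One missing present: the result keeps Union{Missing, Float64}.
y[end] = missing
eltype_with_missing = eltype(add_vectors(y, ϵ))
```

If the narrowing also shows up here, it is a property of broadcasting itself rather than of global scope.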

pdeffebach · October 9, 2021, 12:15pm

allowmissing in Missings.jl would be the best way.

Seems to exist in functions too.

I think this makes sense from the design of broadcasting. But it is annoying.

viraltux · October 9, 2021, 12:23pm

Yeah, I know, but the problem is that if I do z = y .+ ϵ then z[end] = missing throws an error, which is what I want to avoid.

This is what allowmissing does in Missings.jl:

allowmissing(x::AbstractArray{T}) where {T} = convert(AbstractArray{Union{T, Missing}}, x)

I could convert directly instead of loading a package, but I was wondering if this is supposed to be the default behavior. I am asking because for certain algorithms I will have to convert after every single operation I do with Arrays capable of containing missing values.
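For reference, a minimal sketch of the convert-only route using nothing but Base. `widen_to_missing` is a hypothetical name that mirrors the allowmissing definition quoted above; it is not part of any package.

```julia
# Hypothetical helper mirroring the quoted allowmissing definition, Base only.
widen_to_missing(x::AbstractArray{T}) where {T} =
    convert(AbstractArray{Union{T,Missing}}, x)

y = Vector{Union{Missing,Float64}}(rand(5))
ϵ = Vector{Union{Missing,Float64}}(rand(5))

# y .+ ϵ on its own has eltype Float64 here, so widen the result explicitly.
z = widen_to_missing(y .+ ϵ)

# Assignment succeeds because the eltype of z now allows missing.
z[end] = missing
```

The cost is one extra conversion per operation, which is the overhead the question is about.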

jling · October 9, 2021, 12:42pm

The right-hand side allocates a new vector, as such, its type may need shrinking or widening. In this case, the algorithm decided shrinking is useful (makes sense, because you didn’t have any missing to begin with)

viraltux · October 9, 2021, 12:47pm

I would say in analytics that’s debatable; the whole point of defining Array{Union{Missing,T}} is that I am planning to insert missing values, and I do not want a type conversion every time my Arrays happen not to have any in an operation.

The workaround I am considering is using NaN instead of missing, since NaN isa Number is true and I would not run into this kind of problem.
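A rough sketch of what that NaN workaround could look like, assuming the data are floating point (NaN does not exist for integer element types). The variable names are illustrative only.

```julia
y = rand(5)
ϵ = rand(5)
y[end] = NaN     # NaN used as a stand-in for a missing observation

z = y .+ ϵ       # eltype stays Float64; NaN propagates through +
z[1] = NaN       # marking another entry needs no conversion

# skipmissing does not skip NaN, so the entries have to be filtered by hand.
valid_sum = sum(filter(!isnan, z))
```

The trade-off is that a NaN produced by a genuine numerical problem can no longer be told apart from a deliberately marked entry.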

jling · October 9, 2021, 12:50pm

then you should be doing

y .+= ϵ

instead? and keep using y since that’s your “pre-allocated vector”

viraltux · October 9, 2021, 12:53pm

That’s nice, as long as I don’t need y for anything else…

rafael.guerra · October 9, 2021, 12:54pm

z = copy(y)
z .= y + ϵ

jling · October 9, 2021, 12:55pm

well, you can also z = (copy(y) .+= ϵ)

viraltux · October 9, 2021, 12:56pm

This one perhaps could be faster with

z = similar(y)
z .= y .+ ϵ
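Putting the pre-allocation suggestions from this thread together, here is a small sketch that checks the destination keeps its element type. It assumes short vectors built with the Base array constructor and no missing values in the inputs.

```julia
y = Vector{Union{Missing,Float64}}(rand(5))
ϵ = Vector{Union{Missing,Float64}}(rand(5))

z = similar(y)     # same size and eltype as y, contents uninitialized
z .= y .+ ϵ        # fused broadcast written into z; z keeps its eltype
z[end] = missing   # allowed, because z can hold missing

y .+= ϵ            # in-place variant: y is overwritten and keeps its eltype
```

Because the destination already exists, broadcasting writes into it and never gets the chance to pick a narrower element type.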

viraltux · October 9, 2021, 12:57pm

Missing values are so important that I wonder if we could ask for them to be a subtype of Number.

jling · October 9, 2021, 1:02pm

definitely not

viraltux · October 9, 2021, 1:43pm

Very useful explanation
