Arithmetic operations on multi-dimensional arrays with Missings


#1

Given arrays with missing values, I am unable to work out the mean of a multi-dimensional array over a given region.

For example,

x = convert(Array{Union{Float64,Missing}}, rand(2,2))
x[1] = missing
mean(skipmissing(x))

works as expected. However, computing the mean over a given region, say mean(skipmissing(x), 2), raises an error indicating that no matching method is defined.

Other than writing my own method (which I cannot do at the moment), is there a way to calculate the mean of a multi-dimensional array with missing values over a given region?


#2

Perhaps something like

julia> x
2×2 Array{Union{Missing, Float64},2}:
  missing  0.848082
 0.125747  0.481677

julia> [mean(skipmissing(x[i, :])) for i in 1:size(x, 1)]
2-element Array{Float64,1}:
 0.8480819826358683
 0.3037118716331342

julia> hcat([mean(skipmissing(x[i, :])) for i in 1:size(x, 1)])
2×1 Array{Float64,2}:
 0.8480819826358683
 0.3037118716331342

where the comprehension iterates over the rows of x and the hcat call turns the result into a Matrix (just like mean(x, 2) returns) instead of a Vector.
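
On Julia 1.1 or later, the same row-wise idea can be written with eachrow, which avoids indexing by hand. This is only a minimal sketch: it assumes the Statistics standard library is loaded, and it uses a hand-typed copy of the matrix above with rounded values, so the results will differ slightly in the trailing digits.

```julia
using Statistics  # mean lives in the Statistics stdlib on Julia 0.7 and later

# Hand-typed copy of the example matrix (values rounded for illustration).
x = Union{Missing, Float64}[missing 0.848082; 0.125747 0.481677]

# Mean of each row, skipping missing entries. eachrow requires Julia 1.1+.
rowmeans = [mean(skipmissing(row)) for row in eachrow(x)]

# Reshape to an n×1 Matrix, the shape a reduction over dimension 2 would have.
rowmeans_matrix = reshape(rowmeans, :, 1)
```

Here reshape is used in place of hcat purely as a matter of taste; both give a 2×1 Matrix for this input.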


#3

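Or using mapslices (the last argument is the dimension to reduce over; it is 1 here, so use 2 for the case in the question):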

mapslices(xi->mean(skipmissing(xi)),x,1)

although this is probably slower than a hand-coded function.
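
For reference, a hand-coded version might look roughly like the sketch below. The name mean_skipmissing is hypothetical (it is not part of Base or Statistics), the function only handles plain matrices, and I have not benchmarked it against mapslices, so treat the speed claim as an expectation rather than a measurement.

```julia
# Hypothetical helper, not a Base/Statistics function: mean of a Matrix over
# dimension `dim` (1 or 2), ignoring missing entries.
function mean_skipmissing(x::Matrix, dim::Integer)
    dim in (1, 2) || throw(ArgumentError("dim must be 1 or 2"))
    n = size(x, 3 - dim)            # length of the dimension that is kept
    sums = zeros(Float64, n)
    counts = zeros(Int, n)
    for j in 1:size(x, 2), i in 1:size(x, 1)   # j is the outer loop (column-major)
        v = x[i, j]
        if !ismissing(v)
            k = dim == 1 ? j : i    # which output slot this entry contributes to
            sums[k] += v
            counts[k] += 1
        end
    end
    means = sums ./ counts          # a slice with no values yields NaN (0.0 / 0)
    # Keep the reduced dimension with length 1, like a reduction over `dim`.
    return dim == 1 ? reshape(means, 1, n) : reshape(means, n, 1)
end

# Example input (hand-typed, rounded copy of the matrix from the earlier post).
x = Union{Missing, Float64}[missing 0.848082; 0.125747 0.481677]
row_means = mean_skipmissing(x, 2)   # 2×1 Matrix, one mean per row
col_means = mean_skipmissing(x, 1)   # 1×2 Matrix, one mean per column
```

On Julia 0.7 and later the mapslices call itself takes the dimension as a keyword, i.e. mapslices(xi -> mean(skipmissing(xi)), x; dims=2).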


#4

@fredrikekre, @fabiangans: both solutions worked. Thanks!


#5

I just submitted a PR to incorporate a generic version of this into Base: https://github.com/JuliaLang/julia/pull/27818