I have a matrix with missing values and I want to calculate the column means. I’m not sure how to drop the missing values.
using Statistics
vec = [1, missing, 2]
mean(vec) # is missing
mean(skipmissing(vec)) # gives 1.5, which is what I want in 2 dimensions
# now I have a Matrix A with missing
A = [1 5
# unsure how to calculate column means
Statistics.mean(A, dims=1) # error for missing value
Statistics.mean(skipmissing(A), dims=1) # error
Everywhere I look I can only find vector examples… I don’t see any options in the mean function to drop missing in 2d. Any help would be appreciated.
skipmissing returns a linear iterator (one single axis) without allocating a copy:
julia> skipmissing(A) |> collect |> size
The dims keyword does not work in that case because the iterator has only a single axis.
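To make the flattening concrete, here is a small sketch. The matrix `A` below is a made-up 2×3 example, not the data from the question:

```julia
using Statistics

# Hypothetical 2×3 matrix with two missing entries (illustration only)
A = [1 missing 3; 4 5 missing]

# skipmissing walks A in column-major order and drops the missing entries,
# so collecting it yields a plain vector, not a matrix
flat = collect(skipmissing(A))

@show flat                    # [1, 4, 5, 3]
@show size(flat)              # (4,)
@show mean(skipmissing(A))    # 3.25, the mean over all non-missing entries
```

Since the row/column structure is gone after `skipmissing`, there is nothing left for `dims` to refer to.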
You can make an iterator that goes over each slice you care about (with eachrow) and then broadcast mean on the resulting iterator of single-axis arrays.
There might be a cleaner way to do it. The underlying issue is that skipmissing cannot return a multi-axis array, because different rows might need a different number of skips due to a different number of missing values.
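A minimal sketch of that idea, again on a made-up matrix `A` (an assumption for illustration, not the poster's data):

```julia
using Statistics

# Hypothetical 2×3 matrix with missing entries
A = [1 missing 3; 4 5 missing]

# eachrow(A) iterates over the rows; skipmissing.(...) wraps each row in its
# own skipping iterator; mean.(...) then averages each row separately
row_means = mean.(skipmissing.(eachrow(A)))

@show row_means   # [2.0, 4.5]
```

Each row is skipped independently, so rows with different numbers of missing entries are not a problem.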
Perfect, this line worked:
(Small edit: I needed column means, so I took the transpose of my matrix, which is number of individuals (3k) by number of SNPs (45k).)
Okay this was my worry, I hope the Statistics package improves this soon to deal with missing as this is more work than it needs to be (imo…). Thank you for your help!
Thank you very much for your suggestion here, I will read this over now.
There is also eachcol, which gives an iterator over columns. It does not really matter whether you transpose or whether you switch from eachrow to eachcol.
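As a quick check that the two routes agree, here is a sketch on a hypothetical matrix `A`. I use `permutedims` for the transposed version, which is my own choice for the illustration:

```julia
using Statistics

# Hypothetical 2×3 matrix with missing entries
A = [1 missing 3; 4 5 missing]

# Column means directly, via eachcol
col_means = mean.(skipmissing.(eachcol(A)))

# Same result by flipping the matrix first and then going over rows
via_transpose = mean.(skipmissing.(eachrow(permutedims(A))))

@show col_means                     # [2.5, 5.0, 3.0]
@show col_means == via_transpose    # true
```

The `eachcol` version avoids building the flipped copy, which may matter for a large matrix.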
Oh shoot, I tried eachcolumn() and it didn’t work. Thanks, I’ll use eachcol.
Well, you lose some performance in mean.(skipmissing.(eachcol(A))) compared to a potential mean(skipmissing(A), dims=1), but the former is more general: substitute any aggregation instead of mean and it’ll work, without special support by the function.
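To illustrate the generality point, a small sketch with a few other aggregations swapped in. The matrix `A` is made up for the example:

```julia
using Statistics

# Hypothetical 2×3 matrix with missing entries
A = [1 missing 3; 4 5 missing]

# Build the per-column skipping iterators once and reuse them
cols = skipmissing.(eachcol(A))

col_medians = median.(cols)    # [2.5, 5.0, 3.0]
col_sums    = sum.(cols)       # [5, 5, 3]
col_maxima  = maximum.(cols)   # [4, 5, 3]

@show col_medians col_sums col_maxima
```

None of these functions needs to know anything about missing values or about `dims`; each one just sees an iterator of plain numbers.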
Anyway, there’s a long-stalled PR linked from the issue above (https://github.com/JuliaLang/julia/pull/28027), so you may wish to update/promote it if this feature seems important.
Oh I see… Thank you for this information. Well then they must be aware, I’m not much for development, I’m still trying to learn the basics. Julia is kind of a beast compared to R to learn. Thanks for all your help.