# Ignoring NaN in elementwise aggregations

I would like to do calculations like

``````mean(a_bunch_of_similar_arrays)
``````

but with `NaN`s quietly ignored.

MWE with an inefficient and inelegant solution:

``````A1 = [NaN 1; 2 3]

A2 = [1 NaN; 2 3]

As = [A1, A2]

function elementwise_ignoreNaN(f, arrays)
ax = axes(first(arrays))
I = CartesianIndices(ax)
@assert all(a -> axes(a) == ax, arrays)
map(I) do ι
x = filter(!isnan, [a[ι] for a in arrays])
f(x)
end
end

elementwise_ignoreNaN(mean, As) # [1 1; 2 3]
``````

but I wonder if this functionality exists already in some package, or if I could do this better.

2 Likes

Any reason not to use broadcasting? E.g. something similar to

``````julia> nanless(f, a...) = f(filter(!isnan, a))
nanless (generic function with 1 method)

julia> nanless.(mean, A1, A2)
2×2 Array{Float64,2}:
1.0  1.0
2.0  3.0
``````

I have about 5000 such arrays (sorry, should have said), and I am not sure that broadcasting is the good match for that.

Could `skipmissing` help if you use `missing` instead of `NaN`? For some reason it doesn’t seem to work on your MWE but it promises to be the efficient way to filter missing values in calculations.

Maybe you are asking for a package that does this out-of-the-box, but I think using `LazyArrays.@~` to create `Broadcasted` objects is a neat approach:

``````xs = [[NaN 1; 2 3], [1 NaN; 2 3]]

add!!(a, b) = a .+ b
add!!(a::AbstractArray, b) = a .+= b
sumpw(f, xs) = mapreduce(f, add!!, xs)

using LazyArrays: @~
fillnan(x) = isnan(x) ? zero(x) : x
sumpw(x -> (@~ fillnan.(x)), xs) ./
sumpw(x -> (@~ .!isnan.(x)), xs)
``````

With this approach, you don’t have to allocate intermediate arrays like `fillnan.(x)`. The array allocated by materializing `fillnan.(xs) +. fillnan.(xs)` is used for a entire base case of the reduction. You can also use `mapfoldl` instead of `mapreduce` to minimize the allocation.

(I’ll probably add `add!!` to BangBang.jl so that above code would be more generic.)

This is also easily parallelizable by `using Transducers; sumpw(f, xs) = reduce(add!!, Map(f), xs)`.

2 Likes

Have you had a look at NaNMath.jl?

2 Likes

Thanks for the suggestion - NaNMath.jl should simplify the filter/reduce step. I still need a neat way to map a bunch of arrays elementwise though.

I don’t think it is appropriate for my use case, since the objects being skipped are elements of arrays in a vector, not elements of a vector.

It is indeed neat (once I understand it ), but conceptually isn’t it just

``````As = [[NaN 1; 2 3], [1 NaN; 2 3]]
sum(A -> map(x -> isnan(x) ? zero(x) : x, A), As) ./ sum(A -> map(!isnan, A), As)
``````

That’s right. The reason why I thought it was neat was that `@~`-based implementation allocates arrays only when necessary; i.e., thanks to the clever design of the broadcasting machinery, `fillnan.(x)` and `a .+ b` are fused even though those expressions are not syntactically in the same “dot call expression.”

1 Like