Conditional statistics over one dimension of a multidimensional array

I would like to compute conditional statistics over one dimension of a two-dimensional array. I have an array a and a second array b of the same shape, and I would like to compute the mean of a over the first dimension, taking into account only the cells where b < 0.5. With NumPy this is easy with masked arrays, but I haven’t found a Julia equivalent. What is the Julian way of solving this?

using Statistics

nx = 16
nz = 8

a = rand(nx, nz)
b = rand(nx, nz)

# This gives a vertical profile of 8 layers. This is the right shape,
# but without the mask applied.
a_prof = mean(a, dims=1)

# This collapses everything to a single value (a 1-element vector), but I would
# like a vertical profile of 8 layers, with only the cells where b < 0.5 in the computation.
a_prof_b = mean(a[b .< 0.5], dims=1)

One suggestion:

a_prof_b = [mean(a[b[:,i] .< 0.5, i]) for i in 1:size(a,2)]
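If you would rather keep the 1×nz shape that mean(a, dims=1) returns, a rough alternative is to divide the masked column sums by the per-column counts. This is only a sketch: masked_colmean is a made-up helper name, not something from Statistics, and a column in which no cell passes the condition comes out as NaN (0/0).

```julia
using Statistics

# Hypothetical helper (not part of any package): mean over the first
# dimension, counting only the cells where `mask` is true.
function masked_colmean(a, mask)
    # Masked-out cells contribute 0 to the sum; the denominator is the
    # number of selected cells in each column.
    return sum(a .* mask, dims=1) ./ sum(mask, dims=1)
end

nx, nz = 16, 8
a = rand(nx, nz)
b = rand(nx, nz)

a_prof_b = masked_colmean(a, b .< 0.5)   # 1×8 matrix, same layout as mean(a, dims=1)
```

Wrapping the result in vec(a_prof_b) gives a plain vector that can be compared against the comprehension above.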

Looks like your arrays are related and have corresponding indices, so it makes sense to keep and use them together:

using SplitApplyCombine
using StructArrays

# reproduce your simple mean of a with SplitApplyCombine functions:
map(splitdims(a, 2)) do x
	mean(x)
end

# combine a and b arrays to use them together:
AB = StructArray(; a, b)

# compute the desired conditional mean:
map(splitdims(AB, 2)) do x
	mean(x.a[x.b .< 0.5])
end
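For comparison, the same split-then-reduce idea can be sketched with Base only, by walking the columns of both arrays in lockstep with eachcol. The function name conditional_colmean and its argument order are assumptions made for this illustration, not an existing API.

```julia
using Statistics

# Hypothetical helper: for each column, average the cells of `a` whose
# matching cell in `b` satisfies the predicate `cond`.
function conditional_colmean(cond, a, b)
    map(eachcol(a), eachcol(b)) do ca, cb
        mean(ca[cond.(cb)])   # columns with no match yield NaN
    end
end

a = rand(16, 8)
b = rand(16, 8)

a_prof_b = conditional_colmean(x -> x < 0.5, a, b)   # 8-element vector
```

Passing the condition as a function keeps the helper reusable for other thresholds, which is roughly what the StructArray version gives you by carrying a and b together.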