Hi everyone,
I’m working on a small personal project and wrote this sigma clipping function in Julia. It works fine so far, but I’m definitely not an expert and I feel like it could be way more efficient — especially to reduce memory allocations and speed up runtime.
The function is meant to be generic, so it should work both on vectors and matrices, and also allow the user to choose:
- which functions to use for the center (mean, median, etc.)
- which function to use for the standard deviation (std, mad, etc.)
- and which clamp value to apply when masking outliers (
clampvalue
)
Implementation details
Currently, I’m using a boolean mask (BitVector) to keep track of which elements are considered “good” or clipped out. This mask is updated at each iteration of the clipping loop.
To compute the center and standard deviation on the “good” subset, I use view(x, mask)
— i.e., a view of the array selecting only the elements where the mask is true
.
Inside the loop, I iterate over every element and update the mask based on the clipping criteria.
Finally, I apply the mask to the output array, replacing clipped values with the chosen clampvalue
.
using Statistics
function mclip!(a, mask, cval)
@inbounds @simd for p in eachindex(mask)
if !mask[p]
a[p] = cval
end
end
end
function sigma_clip!(x::AbstractArray{T},
out::AbstractArray{T};
sigma_low = 3,
sigma_upp = 3,
cent_func = mean,
std_func = std,
maxiters::Int = 5,
clampvalue = NaN) where {T}
mask = trues(size(x))
m = cent_func(x)
s = std_func(x)
r = 1 # number of clipped elements
n = 0
while (n <= maxiters) && (r != 0)
r = 0
@inbounds for p in eachindex(x)
if x[p] > m + (s * sigma_upp) || x[p] < m - (s * sigma_low)
mask[p] = false
r += 1
end
end
m = cent_func(view(x, mask))
s = std_func(view(x, mask))
n += 1
end
mclip!(out, mask, clampvalue)
return out
end
I’d appreciate any tips on how to:
Cut down temporary allocations inside the loop
Avoid unnecessary copies if possible
Or anything else that could make this faster, cleaner, or more Julian
Thanks so much for your patience and help with a newbie like me!