[ANN] RollingWindowArrays.jl - flexible and efficient rolling window operations

Wanted to announce RollingWindowArrays.jl, a small package I have been working on for rolling winow operations on arrays, in case anyone finds it useful:

julia> x = rand(6)
6-element Vector{Float64}:
 0.20267030347757775
 0.9461987020973889
 0.19249855800780524
 0.5748456080526526
 0.7032664181878268
 0.6079588708103658

julia> rolling(x, 3)
4-element RollingWindowArrays.RollingWindowVector{SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true}, 1, Vector{Float64}, Nothing}:
 [0.20267030347757775, 0.9461987020973889, 0.19249855800780524]
 [0.9461987020973889, 0.19249855800780524, 0.5748456080526526]
 [0.19249855800780524, 0.5748456080526526, 0.7032664181878268]
 [0.5748456080526526, 0.7032664181878268, 0.6079588708103658]

Contrary to RollingFunctions.jl, rolling does not take a function argument but instead returns a RollingWindowVector that’s simply an AbstractVector of views into the original array. Reductions – e.g. a rolling mean – can then be applied simply via broadcasting:

julia> using Statistics

julia> mean.(rolling(x, 3))
4-element Vector{Float64}:
 0.4471225211942573
 0.5711809560526157
 0.4902035280827615
 0.6286902990169484

This allows one to take full advantage of Julia’s higher-level array abstractions, so something like the following “just works” without having to allocate any temporary arrays:

julia> maximum(mean, rolling(x, 3))
0.6286902990169484

This approach also turns out to be quite efficient:

julia> @b mean.(RollingWindowArrays.rolling($(rand(1000)), 11))
11.953 ÎĽs (3 allocs: 7.812 KiB)

julia> @b RollingFunctions.rollmean($rand(1000), 11)
29.676 ÎĽs (1491 allocs: 39.406 KiB, <0.01% compile time)

What is still missing compared to RollingFunctions.jl is the ability to have tapering or filled windows at the end points as I have not had a use for it yet.

Additional features include the ability to roll over multidimensional arrays and native support for DimArrays from DimensionalData.jl. For additional examples, check out the README .

22 Likes

Afaiu, it does use the O(n*k) naive algorithm for rolling mean, though, instead of the O(n) algorithm. Can you explicitly warn about that? O(n*k) is not what most people normally think of when they hear “rolling mean” / “rolling hash”!

Often the entire point is that you can shift the window in O(1) instead of O(k). (k is the window size)

And I would not be very surprised if someone pointed out a very cool paper with a fancy complicated non-obvious algorithm + runtime analysis to compute rolling min / max / median in expected O(n) or O(n log k) or something.

2 Likes

The underlying algorithm in FastRunningMedian.jl should scale as O(n log k).

2 Likes