Cummean, cumall, and cumany

Currently, these functions exist in Base:

• cumsum
• cumprod
• cummin (using accumulate)
• cummax (using accumulate)

I’m just checking out dplyr’s functions and they have

• cummean
• cumall (logical bit vector)
• cumany (logical bit vector)

Does it make sense to add these to Base? Or perhaps all these functions should be moved to Statistics?

Well, we have `accumulate`, in terms of which `cumall` is just `accumulate(&, A)` and `cumany` is `accumulate(|, A)`, so it doesn’t seem worthwhile adding specific functions for these.
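As a minimal sketch of the point above: the names `cumall` and `cumany` below are hypothetical helpers borrowed from dplyr, not functions in Base, and each is only a thin wrapper around `accumulate`.

```julia
# Hypothetical helpers named after dplyr's functions; neither exists in Base.
cumall(A) = accumulate(&, A)   # true up to (not including) the first false
cumany(A) = accumulate(|, A)   # true from the first true onward

all_flags = cumall([true, true, false, true])    # [true, true, false, false]
any_flags = cumany([false, false, true, false])  # [false, false, true, true]

# The same pattern gives a cumulative minimum or maximum.
running_min = accumulate(min, [3, 1, 2])  # [3, 1, 1]
running_max = accumulate(max, [1, 3, 2])  # [1, 3, 3]
```

Since the wrappers are one-liners, calling `accumulate` directly is arguably just as readable.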

`cummean(A) = cumsum(A) ./ (1:length(A))` is slightly more complicated, at least if you want to eliminate the allocation of a temporary array. If it’s a common operation it might make sense in Statistics. However, I can’t seem to find any record of anyone ever asking for this function before, so that doesn’t seem to indicate a big demand?
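One possible way to avoid the temporary array, as a sketch only: `cummean` here is a hypothetical user-defined function (it is not in Base or Statistics), and it assumes a one-dimensional real-valued input with a `Float64` result.

```julia
# Hypothetical single-pass cumulative mean; allocates only the output vector.
# Assumes a real-valued vector and returns Float64 values.
function cummean(A::AbstractVector{<:Real})
    out = Vector{Float64}(undef, length(A))
    total = 0.0
    for (i, x) in enumerate(A)
        total += x           # running sum of the first i elements
        out[i] = total / i   # mean of the first i elements
    end
    return out
end

means = cummean([1, 2, 3, 4])  # [1.0, 1.5, 2.0, 2.5]
```

This should agree with `cumsum(A) ./ (1:length(A))` up to floating-point rounding, while skipping the intermediate `cumsum` array.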

Ironically, I am currently porting some code from Matlab that contains the equivalent of `cummean` as described here. So count this as a request from me!

Yeah, I couldn’t find any previous requests in the Julia repo either.

Looks like Bayesian people want it?

https://juliahub.com/ui/CodeSearch?q=cummean%20&u=define&t=all

Just to give some publicity to a couple of really nice packages, you could do the cumulative mean with OnlineStats + Transducers:

```julia
julia> using OnlineStats, Transducers

julia> collect(Transducer(Mean()), 1:4)
```

(It’s one of the examples in the docs.)

Did not know this. Thanks for sharing.

Is it fast? I thought the canonical way in Julia is `accumulate`.

`Transducer(::OnlineStat)` is not really for performance. It’s mainly for exposing the super rich set of functionality that exists in OnlineStats.jl to the Transducers.jl-based API. OnlineStats are really “just” reducing functions obscured by the implementation detail that they use in-place mutation. So, it’d be such a pity if the two libraries could not talk to each other.
