Cummean, cumall, and cumany

Currently, these functions exist in Base:

  • cumsum
  • cumprod
  • cummin (using accumulate)
  • cummax (using accumulate)

I was just looking through dplyr’s functions, and it also has:

  • cummean
  • cumall (logical bit vector)
  • cumany (logical bit vector)

Does it make sense to add these to Base? Or perhaps all these functions should be moved to Statistics?

Well, we have accumulate, in terms of which cumall is just accumulate(&, A) and cumany is accumulate(|, A), so it doesn’t seem worthwhile adding specific functions for these.
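As a minimal sketch of that point, the two one-liners could be written as below. The names cumall and cumany are hypothetical here (borrowed from dplyr); they are not defined in Base.

```julia
# Hypothetical helpers named after dplyr's functions; not part of Base.
cumall(A) = accumulate(&, A)   # true up to (but not including) the first false
cumany(A) = accumulate(|, A)   # true from the first true onward

all_so_far = cumall([true, true, false, true])
any_so_far = cumany([false, false, true, false])

println(all_so_far)  # Bool[1, 1, 0, 0]
println(any_so_far)  # Bool[0, 0, 1, 1]
```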

cummean(A) = cumsum(A) ./ (1:length(A)) is slightly more complicated, at least if you want to eliminate the allocation of a temporary array. If it’s a common operation, it might make sense in Statistics. However, I can’t find any record of anyone asking for this function before, which suggests there isn’t much demand.
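For illustration, one way to avoid the temporary array is a single loop that keeps a running total. This is only a sketch under some assumptions: cummean is a hypothetical name, the input is a vector of real numbers, and the result is always a Vector{Float64}.

```julia
# Hypothetical helper; not part of Base or Statistics.
# Assumes a vector of reals and always returns Float64 means.
function cummean(A::AbstractVector{<:Real})
    out = Vector{Float64}(undef, length(A))  # the only allocation
    total = 0.0
    for (k, x) in enumerate(A)               # k counts elements seen so far
        total += x
        out[k] = total / k
    end
    return out
end

println(cummean(1:4))  # [1.0, 1.5, 2.0, 2.5]
```

The one-liner above allocates both the cumsum result and the broadcast result, while this version allocates only the output.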

Ironically, I am currently porting some code from Matlab that contains the equivalent of cummean as described here. So, :+1: from me!

Yeah, I couldn’t find any previous requests in the Julia repo either.

Looks like Bayesian people want it?

https://juliahub.com/ui/CodeSearch?q=cummean%20&u=define&t=all

Just to give some publicity to a couple of really nice packages: you could compute the cumulative mean with OnlineStats + Transducers:

julia> using OnlineStats, Transducers

julia> collect(Transducer(Mean()), 1:4)

(It’s one of the examples in the docs.)

Did not know this. Thanks for sharing.

Is it fast? I thought the canonical way in Julia was accumulate.
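For comparison, an accumulate-based running mean might look like the sketch below, which carries a (sum, count) pair as the accumulator state. The variable names are illustrative only, and no performance claim is implied.

```julia
A = [1, 2, 3, 4]

# Each accumulator state is (running sum, number of elements seen).
running = accumulate((acc, x) -> (acc[1] + x, acc[2] + 1), A; init = (0.0, 0))

# Divide each running sum by its count to get the cumulative mean.
means = [s / n for (s, n) in running]

println(means)  # [1.0, 1.5, 2.0, 2.5]
```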

Transducer(::OnlineStat) is not really for performance. It’s mainly for exposing the super rich set of functionality that exists in OnlineStats.jl to the Transducers.jl-based API. OnlineStats are really “just” reducing functions, obscured by an implementation detail: they use in-place mutation. So it’d be such a pity if the two libraries could not talk to each other.
