Median, std, var to accept function as first argument

statistics

#1

Functions such as sum, mean, and maximum accept a function as their first argument:

mean(f::Function, v)

  Apply the function f to each element of v and take the mean.

  julia> mean(√, [1, 2, 3])
  1.3820881233139908

The functions std, var, median, and extrema do not. Is there any particular reason for this behaviour?


#2

I think that mean(::Function, ...) predates the nice broadcasting syntax (like so many similar convenience forms). This would be a good time to deprecate, then remove it.

As for the other statistics, you can always do std(f.(x)) and similar, or use a generator. Note that centered moments take two passes (unless you provide the mean), so the choice depends on whether f is expensive. For large amounts of data, there is always


#3

I don’t think broadcasting replaces these methods, since it (currently) allocates a new vector before computing the reduction. On the other hand, generators should be fine (though slightly more verbose).
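
To make the comparison concrete, here is a minimal sketch of the two spellings, assuming a recent Julia where std lives in the Statistics standard library. The names f and x are placeholders, not anything from the posts above.

```julia
using Statistics

f(t) = sqrt(t)            # placeholder transformation
x = collect(1.0:100.0)    # placeholder data

# Broadcasting builds the temporary array f.(x) first, then reduces it.
s_broadcast = std(f.(x))

# A generator applies f lazily, so std can consume the values one at a time.
s_generator = std(f(t) for t in x)
```

Both forms should agree up to floating-point rounding; the difference is only in whether the intermediate array gets allocated.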


#4

I frequently wish for a function like

lazymap(f, xs) = (f(x) for x in xs)

Or an annotation that makes broadcast forms return generators.
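
A small usage sketch of how such a helper could be combined with the statistics functions, assuming std from the Statistics standard library accepts arbitrary iterators (it does in recent Julia versions). The sample data is made up for illustration.

```julia
using Statistics

lazymap(f, xs) = (f(x) for x in xs)   # same definition as above

xs = [1, 4, 9, 16]                    # illustrative data
roots = lazymap(sqrt, xs)             # nothing is computed yet

s = std(roots)                        # values are produced during the reduction
materialized = collect(roots)         # build an array only when one is wanted
```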


#5

Does this package help?


#6

I’ll add that (on master) OnlineStats has a new type that filters and transforms a data stream without allocating new arrays.

julia> s = series(Mean(); transform = abs, filter = !isnan);

julia> fit!(s, [-1, NaN, -3])
▦ AugmentedSeries{0}  |  EqualWeight  |  nobs = 2
▦ filter = Base.#57  |  transform = abs
└── Mean(2.0)

#7

@map from Query.jl is a lazy map that should work in these situations.


#8

Base.Generator works exactly like lazymap. I wish it was exported.
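
For reference, a short sketch of calling it directly, assuming the constructor in question is Base.Generator(f, iter). The data is illustrative only.

```julia
using Statistics

xs = [1, 4, 9, 16]                    # illustrative data

# Generator is not exported, so the Base prefix is required.
g = Base.Generator(sqrt, xs)

# Generator syntax produces a value of the same type.
g2 = (sqrt(x) for x in xs)
same_kind = g isa Base.Generator && g2 isa Base.Generator

s = std(g)                            # reductions can consume it directly
```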


#9

cf https://github.com/JuliaLang/julia/issues/20402#issuecomment-336280752


#10

Thanks for all the answers! I think generators solve my use case for now (and they are not that verbose compared to what I came up with without generators: std(cat(3, [fun(i) for i = 1:N]...), 3) :joy:)


#11

Perhaps you could make a one-line PR? This would be a great feature.


#12

Apparently, generators are not quite like map:


#13

I could. It’s worth thinking it through, though. Maybe there’s a better name than Generator?


#14

MappedArrays is a thing too. Hmm


#15

FWIW, I like Generator.


#16

What about Iterators.map?
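
For anyone reading this on Julia 1.6 or later, a lazy map is available under that name. A minimal sketch, with made-up data:

```julia
xs = [1, 4, 9, 16]                    # illustrative data

# Iterators.map returns a lazy iterator instead of allocating an array.
lazy = Iterators.map(sqrt, xs)

total = sum(lazy)                     # 10.0
```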