Julia has many functions for computing statistics, typically “reduce”-style functions, that can either take a simple vector or go over an array and compute the output over rows, columns, etc., according to the dims parameter.
Is there a version of this that goes over a collection of vectors, tuples or Points?
As far as I understand, map is far more useful than reduce in your statistics context, because reduce is meant for two-argument functions such as * and +.
I’m not sure about this either, but the docs do mention explicitly that maximum, sum, etc. should be preferred over writing out reduce(max, ...), reduce(+, ...) etc. if possible. So in that example, I agree that maximum(abs2, v) is more intuitive.
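To make the comparison concrete, here is a minimal sketch on a plain vector (the vector v and the variable names are made up for illustration). It only uses functions from Base:

```julia
v = [3.0, -4.0, 1.5]

# reduce folds a two-argument function over the collection.
total_reduce   = reduce(+, v)
largest_reduce = reduce(max, v)

# The dedicated reductions say the same thing more directly.
total   = sum(v)
largest = maximum(v)

# A function passed as the first argument is applied to each element
# before reducing; this matches mapreduce(abs2, max, v).
largest_sq = maximum(abs2, v)
```

So the map step and the reduce step can be combined in a single call, without an intermediate array.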
Apologies that my first message was not too clear. These examples should better illustrate what happens.
```julia
julia> data = randn(Point3f,111);

julia> mean(data)
3-element Point{3, Float32} with indices SOneTo(3):
 0.09946571
 0.043787897
 0.029109783

julia> median(data)
-0.20733127f0

julia> maximum(data)
3-element Point{3, Float32} with indices SOneTo(3):
  2.6677918
  1.3926599
 -0.09724049
```
The challenge is that Point has specific semantics for each of the operations I’m interested in. For mean or sum, it works as I want. For the other operations, it does not: I want the computation applied separately along each coordinate. That would take some kind of specialized functor for Points. The result would be the same as stacking the vector of points into a matrix xx, calling maximum(xx, dims=2), and returning the result as a Point. The whole point of this is that I’m trying to use Point more often instead of vectors, so I want the operations I commonly use with vectors and arrays to be easy to perform when my data is a collection of Points.
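A rough sketch of what such a helper could look like, using plain tuples as stand-ins for Points so that it runs without extra packages. The name componentwise is hypothetical and not part of any library. I assume it would carry over to Point because Points can be indexed by coordinate, but I have not tested that here:

```julia
using Statistics

# Hypothetical helper: apply the reduction `f` to each coordinate across a
# collection of equal-length points and return one value per coordinate.
componentwise(f, pts) = ntuple(i -> f(p[i] for p in pts), length(first(pts)))

pts = [(1.0, 5.0, -2.0), (3.0, 0.0, 4.0), (2.0, 7.0, 1.0)]

cw_max = componentwise(maximum, pts)   # largest value of each coordinate
cw_med = componentwise(median, pts)    # median of each coordinate

# The same maximum via the "stack into a matrix" route:
# each point becomes one column, then reduce along the columns.
xx = reduce(hcat, collect.(pts))       # 3×3 matrix
stacked_max = vec(maximum(xx, dims=2))
```

With Points, the returned tuple could presumably be wrapped back into a Point at the end.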
Addition and scaling of Point3fs indeed behave as you expect, i.e. componentwise, and therefore sum and mean work for you without issue. But Point3fs are sorted lexicographically, so maximum(data), which should return the largest Point3f in data according to this order, typically simply yields the one with the largest first component.
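Tuples in Base are also ordered lexicographically, so the difference can be sketched without any packages (the data below is made up for illustration, and tuples only stand in for Point3f here):

```julia
pts = [(1.0, 9.0), (2.0, -5.0), (2.0, -7.0)]

# Lexicographic order: the first component decides, later ones only break ties.
lex_max = maximum(pts)

# Componentwise maximum: broadcast max over pairs of tuples.
cw_max = reduce((a, b) -> max.(a, b), pts)
```

Here lex_max is (2.0, -5.0), an actual element of pts, whereas cw_max is (2.0, 9.0), which is not an element of pts at all.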
I’m not completely sure what the rationale is for median, but we get the midrange of the middle point according to that order: mean(extrema(sort(data)[56])), i.e. the mean of the smallest and largest components of the 56th of the 111 sorted points. In fact, as median(rand(Point3f, 2)) throws, this is probably not intended. Personally, I would expect to get a Point3f back: the (mean of the) middle (two) Point3f(s), where we again use the default lexicographic order.
In any case, my point is that Vector{Point3f}s are not the type to use if you want e.g. componentwise maximum. But you can always just ‘convert’ it to a Matrix for free using reinterpret and then use the maximum(..., dims=2) you mentioned: