I have a bunch of coordinates and corresponding values:
n = 100
xy = [rand(2) for _ in 1:n]
v = rand(n)
I want to 2D-bin the coordinates, and calculate the mean of all the values,
v, that fall into each corresponding bin.
For example, if only these two coordinate-value pairs fell into the same bin (e.g., the bin with edges
([0, 0.2), [0, 0.2))):
xy1 = (0.1, 0.1)
v1 = 0.0
xy2 = (0.01, 0.19)
v2 = 1.0
then I expect that the bin containing them would have the mean of these coordinates’ values:
(0 + 1)/2 = 0.5.
Do you know of any “packaged” operation that can accomplish this?
The only way I can think of is:
In a standard histogram, each bin contains the count of the points that fall into it (ignoring the values), and in a weighted histogram (see docs here), each bin contains the sum of the values. So I could compute both the weighted and the unweighted histogram and divide the weighted one by the unweighted one, element-wise, to get the mean per bin:
using StatsBase
x = first.(xy)
y = last.(xy)
edges = (0:0.2:1, 0:0.2:1)
uw = fit(Histogram, (x, y), edges)
w = fit(Histogram, (x, y), weights(v), edges)
h = w.weights ./ uw.weights
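For comparison, the same sum-divided-by-count idea can be sketched in plain Julia without StatsBase. This is only a rough illustration, not a packaged solution: `binned_mean` is a hypothetical helper name, and it assumes sorted edges and left-closed bins `[e[i], e[i+1])`, matching the example bin above. Bins that receive no points come out as `NaN`, because the division there is `0.0 / 0`.

```julia
# Hypothetical helper (not from any package): mean of `v` in each 2D bin.
# Assumes sorted edges and left-closed bins [e[i], e[i+1]).
function binned_mean(x, y, v, xedges, yedges)
    nx, ny = length(xedges) - 1, length(yedges) - 1
    sums = zeros(nx, ny)         # per-bin sum of values (the "weighted" histogram)
    counts = zeros(Int, nx, ny)  # per-bin number of points (the "unweighted" histogram)
    for (xi, yi, vi) in zip(x, y, v)
        # index of the last edge that is <= the coordinate
        i = searchsortedlast(xedges, xi)
        j = searchsortedlast(yedges, yi)
        # skip points outside the edge range (including points on the last edge)
        (1 <= i <= nx && 1 <= j <= ny) || continue
        sums[i, j] += vi
        counts[i, j] += 1
    end
    return sums ./ counts        # empty bins give 0.0 / 0 = NaN
end

# The two example points from above, both in the first bin
e = 0:0.2:1
h = binned_mean([0.1, 0.01], [0.1, 0.19], [0.0, 1.0], e, e)
h[1, 1]  # 0.5
```

If `NaN` is not wanted for empty bins, the result could be post-processed, for example by replacing those entries with `missing`; which convention is appropriate depends on the downstream use.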