Median vs 0.5 quantile giving different answers



I’m getting different answers for median(v, w::AnalyticWeights) and quantile(v, w::AnalyticWeights, 0.5), and I’m not sure why. Any ideas?


Please provide an example

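using StatsBase; using Distributions
v = [1; 4; 3; 2; 2.5; 7]; w = [0.1; 0.3; 0.05; 0.05; 0.2; 0.3]
median(v, aweights(w))          # weighted median
quantile(v, aweights(w), 0.5)   # weighted 0.5 quantile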

Median returns 4.0 and Quantile returns 3.5.


Sometimes a quantile isn’t uniquely defined: every point in some interval satisfies the definition, and the usual fix is to take the average of the endpoints of that interval. However, it only makes sense to use a definition that ensures that the median and the 0.5 quantile are identical.
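For the unweighted case, the averaging convention can be checked with a minimal sketch that uses only the Statistics standard library. The vector u is a made-up example, not taken from the question.

```julia
using Statistics

# Made-up, unweighted example with an even number of elements.
u = [1, 2, 3, 4]

# Any value in [2, 3] splits the data in half; the usual convention
# reports the midpoint of that interval.
m = median(u)
q = quantile(u, 0.5)

println((m, q))  # both are 2.5 under Julia's default quantile definition
```

Here the median and the 0.5 quantile agree, which is the behaviour one would also expect from the weighted versions.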

In this case, things seem to be worse: to me, the result of quantile is just wrong. The 0.5 quantile and the median, say m, are the same thing and should satisfy P(X \geqslant m) \geqslant \frac{1}{2} and P(X \leqslant m) \geqslant \frac{1}{2}. For your inputs, I get

julia> x = [1; 4; 3; 2; 2.5; 7];

julia> w = [0.1;0.3;0.05;0.05;0.2;0.3];

julia> sum(w[x .<= 3.5])

which is about 0.4 and therefore less than \frac{1}{2}, so 3.5 is not a median. To see that 4 is the unique median, you can create the following table and see that the row with x=4 is the only one in which both probabilities are at least \frac{1}{2}.

julia> p = sortperm(x);

julia> table(cumsum(w[p]), reverse(cumsum(reverse(w[p]))), x[p], names = [Symbol("P(X<=x)"), Symbol("P(X>=x)"), :x])
Table with 6 rows, 3 columns:
P(X<=x)  P(X>=x)  x
0.1      1.0      1.0
0.15     0.9      2.0
0.35     0.85     2.5
0.4      0.65     3.0
0.7      0.6      4.0
1.0      0.3      7.0
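The same check can be sketched as a small function in plain Julia. The name weighted_medians and the tol keyword are hypothetical, not part of StatsBase, and the function only tests the observed data values, so it does not report interior points of a median interval when one exists.

```julia
# Hypothetical helper, not part of StatsBase: returns every data value m with
# P(X <= m) >= 1/2 and P(X >= m) >= 1/2, using the weights as probabilities.
function weighted_medians(x, w; tol = 1e-12)
    half = sum(w) / 2  # works for unnormalized weights too
    # tol guards against floating-point rounding in the weight sums
    is_median(m) = sum(w[x .<= m]) >= half - tol && sum(w[x .>= m]) >= half - tol
    return sort(filter(is_median, unique(x)))
end

x = [1; 4; 3; 2; 2.5; 7]
w = [0.1; 0.3; 0.05; 0.05; 0.2; 0.3]

weighted_medians(x, w)  # only 4.0 satisfies both inequalities for these inputs
```

With equal weights on [1, 2, 3, 4] the function returns both 2 and 3, which is the non-unique case where averaging the endpoints is the usual convention.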


See also the linked discussion. Matthieu Gomez is the person to contact about this, but he isn’t on Discourse AFAICT.

Note that quantile(v, fweights(w), 0.5) gives yet another (incorrect) answer (7.0).


Simply put, when the number of elements is even, the result is taken to be the mean of the two central elements.


It’s not clear why 3 should be considered a central element here though given the weight vector.