# Median vs 50th Quantile giving different answers

I’m getting different answers from median(v, w::AnalyticWeights) and quantile(v, w::AnalyticWeights, 0.5), and I’m not sure why. Any ideas?

using StatsBase, Distributions
v = [1; 4; 3; 2; 2.5; 7]; w = [0.1; 0.3; 0.05; 0.05; 0.2; 0.3]
median(v, weights(w))
quantile(v, weights(w), 0.5)


median returns 4.0 and quantile returns 3.5.

Sometimes a quantile isn’t uniquely defined; this is often resolved by taking the average of the endpoints of the interval of quantile points. However, it only makes sense to use a definition that ensures that the median and the 0.5 quantile are identical.
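To illustrate the non-uniqueness in the plain unweighted case, here is a minimal sketch that uses only the Statistics standard library. The vector `u` and the helper `is_median` are made up for illustration and are not from the original post. Every value in the interval [2, 3] satisfies the median condition, and both functions resolve the ambiguity by returning the midpoint.

```julia
using Statistics

# Hypothetical data, not from the original post: four equally weighted points.
u = [1, 2, 3, 4]

# Hypothetical helper: m is a median if at least half of the points
# are <= m and at least half are >= m.
is_median(m) = count(<=(m), u) >= length(u) / 2 && count(>=(m), u) >= length(u) / 2

# 2, 2.5 and 3 all pass the check; 3.5 does not.
println([is_median(m) for m in (2, 2.5, 3, 3.5)])

# Both functions pick the midpoint of the interval [2, 3].
println(median(u))
println(quantile(u, 0.5))
```

Here the two functions agree, which is the property one would also expect from the weighted versions.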

In this case, it seems that things are worse. To me, it seems that the result of quantile is just wrong. The 0.5 quantile and the median, say $m$, are the same thing and should satisfy $P(X \geqslant m) \geqslant \frac{1}{2}$ and $P(X \leqslant m) \geqslant \frac{1}{2}$. For your inputs, I get

julia> x = [1; 4; 3; 2; 2.5; 7];

julia> w = [0.1;0.3;0.05;0.05;0.2;0.3];

julia> sum(w[x .<= 3.5])
0.4


so 3.5 is not a median. To see that 4 is the unique median, you can create the following table and see that the row with x=4 is the only one in which both probabilities are at least $\frac{1}{2}$.

julia> p = sortperm(x);

julia> table(cumsum(w[p]), reverse(cumsum(reverse(w[p]))), x[p], names = [Symbol("P(X<=x)"), Symbol("P(X>=x)"), :x])
Table with 6 rows, 3 columns:
P(X<=x)  P(X>=x)  x
─────────────────────
0.1      1.0      1.0
0.15     0.9      2.0
0.35     0.85     2.5
0.4      0.65     3.0
0.7      0.6      4.0
1.0      0.3      7.0
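The same check can be done without a table package. The following is a rough sketch in plain Julia; `is_weighted_median` is a hypothetical helper written for this post, not a StatsBase function. It tests the two inequalities directly, normalizing by the total weight.

```julia
# Data from the thread.
x = [1, 4, 3, 2, 2.5, 7]
w = [0.1, 0.3, 0.05, 0.05, 0.2, 0.3]

# Hypothetical helper (not part of StatsBase): m is a weighted median if at
# least half of the total weight lies at or below m and at least half lies
# at or above m.
function is_weighted_median(m, x, w)
    half = sum(w) / 2
    return sum(w[x .<= m]) >= half && sum(w[x .>= m]) >= half
end

# Keep the data points that satisfy both inequalities.
medians = [m for m in sort(x) if is_weighted_median(m, x, w)]
println(medians)

# Check the value that quantile returned.
println(is_weighted_median(3.5, x, w))
```

Only 4.0 passes, and 3.5 fails because just 0.4 of the weight lies at or below it.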


Note that quantile(v, fweights(w)) gives yet another answer (7.0).
The inconsistency has been fixed by having median call quantile(x, w, 0.5). See this issue and the associated PR.