Kolmogorov-Smirnov test

GHTaarn · November 9, 2022, 3:38pm

I have a sample consisting of a set of integers and want to test if can be from a specific distribution. I am not that strong in statistics, but my understanding is that the Kolmogorov-Smirnov test is a good choice for assessing this, so I tried the code below:

using HypothesisTests, Distributions
n = 50
x = 0.5 .^ (1:n)
d = DiscreteNonParametric(1:n, x ./ sum(x))
sample = vcat(ones(Int64,16), 2*ones(Int64,8), 3*ones(Int64,4), 4,4,5)
display(ExactOneSampleKSTest(sample, d))

The problem is that this rejects the hypothesis, even though the sample seems to fit the distribution almost perfectly. Can anyone explain why this happens? The output is given below:

┌ Warning: This test is inaccurate with ties
└ @ HypothesisTests ~/.julia/packages/HypothesisTests/BgrVj/src/kolmogorov_smirnov.jl:68
Exact one sample Kolmogorov-Smirnov test
----------------------------------------
Population details:
    parameter of interest:   Supremum of CDF differences
    value under h_0:         0.0
    point estimate:          0.5

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           <1e-06

Details:
    number of observations:   31

awasserman · November 9, 2022, 5:38pm

It seems to me like the default tail value for pvalue(t::ExactOneSampleKSTest, tail=:both) is wrong. I think it should use tail = :right, since we’re interested in the probability of the KS statistic being less than the value we have.

If we do pvalue(ExactOneSampleKSTest(sample, d), tail=:right), we get p = 0.92, which seems more reasonable. To be honest, I’m a little confused by the implementation of this in HypothesisTests.

GHTaarn · November 9, 2022, 7:49pm

Thank you, you certainly seem to know much more about this than me, so I think I will file an issue with the developers and see if I can get some clarity on whether this is a bug or just a misunderstanding.

StefanKarpinski · November 9, 2022, 8:11pm

When you file the issue can you post a link here? People might be interested in the discussion, but it is definitely to discuss it there.

Paul_Soderlind · November 9, 2022, 8:18pm

In case you want to double check the results, cell 11 or so in these lecture notes has a simple code for that: FinancialEconometrics/Ch19_TestingDistributions.ipynb at master · PaulSoderlind/FinancialEconometrics · GitHub

awasserman · November 10, 2022, 12:18am

Looking at this more closely, I don’t think it’s a problem with the choice of tails, but rather in the computation of the KS statistic.

See in comparison to the implementation in scipy.stats

HypothesisTests.jl:

julia> print(x)
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5]
julia> print(y)
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 5, 8]
julia> t = ApproximateTwoSampleKSTest(x, y)
┌ Warning: This test is inaccurate with ties
└ @ HypothesisTests ~/.julia/packages/HypothesisTests/BgrVj/src/kolmogorov_smirnov.jl:167
Approximate two sample Kolmogorov-Smirnov test
----------------------------------------------
Population details:
    parameter of interest:   Supremum of CDF differences
    value under h_0:         0.0
    point estimate:          0.516129

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           0.0005

Details:
    number of observations:   [31,31]
    KS-statistic:              2.032002032003047

vs scipy stats

In [13]: stats.kstest(x, y, alternative="two-sided", method="asymp")
Out[13]: KstestResult(statistic=0.032258064516129115, pvalue=1.0)

In [14]: stats.kstest(x, y, alternative="greater", method="asymp")
Out[14]: KstestResult(statistic=0.032258064516129115, pvalue=0.9375209928337668)

In [15]: stats.kstest(x, y, alternative="less", method="asymp")
Out[15]: KstestResult(statistic=0.032258064516129004, pvalue=0.9375209928337671)

GHTaarn · November 10, 2022, 5:41am

I have now filed the issue with the developers ExactOneSampleKSTest rejects hypothesis incorrectly · Issue #280 · JuliaStats/HypothesisTests.jl · GitHub

Thanks to all that participated in this thread.

PharmCat · November 10, 2022, 9:05am

SPSS28 test for this data

NPTESTS 
  /INDEPENDENT TEST (V) GROUP (G) KOLMOGOROV_SMIRNOV 
  /MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE 
  /CRITERIA ALPHA=0.05  CILEVEL=95.

Dan · November 10, 2022, 12:01pm

I think this is the case to pay attention to the warning. The calculation of KS statistic involves sorting and because of stability of sort (i.e. preserve order of identical elements) problems can get exacerbated to extreme.

For example:

julia> pvalue(ApproximateTwoSampleKSTest(sample, sample), tail=:both)
┌ Warning: This test is inaccurate with ties
└ @ HypothesisTests ~/.julia/packages/HypothesisTests/BgrVj/src/kolmogorov_smirnov.jl:167
0.000518320212633612    # rejects strongly

julia> pvalue(ApproximateTwoSampleKSTest(
  sample.+rand(length(sample))/100, 
  sample.+rand(length(sample))/100), tail=:both)

0.8148352946917117     # "nothing to see here"

julia> histogram([pvalue(
  ApproximateTwoSampleKSTest(sample.+rand(length(sample))/100,
  sample.+rand(length(sample))/100), tail=:both) for i in 1:1000])  
  # uses UnicodePlots
              ┌                                        ┐ 
   [0.0, 0.1) ┤▎ 3                                       
   [0.1, 0.2) ┤▋ 11                                      
   [0.2, 0.3) ┤█▍ 21                                     
   [0.3, 0.4) ┤  0                                       
   [0.4, 0.5) ┤███▋ 55                                   
   [0.5, 0.6) ┤  0                                       
   [0.6, 0.7) ┤█████████▏ 134                            
   [0.7, 0.8) ┤  0                                       
   [0.8, 0.9) ┤█████████████████▍ 257                    
   [0.9, 1.0) ┤███████████████████████████████████  519  
              └                                        ┘ 
                               Frequency                 

julia> count([pvalue(
  ApproximateTwoSampleKSTest(
  sample.+rand(length(sample))/100, 
  sample.+rand(length(sample))/100), tail=:both) for i in 1:1000] .< 0.05)
0

It is possible to mitigate this tie-breaking problem in the HypothesisTests package (optionally, and in a way that allows reproducability of tie-breaking). But the results should be okay with a little jitter on the caller side (size of jitter dependent on underlying sample and distrubution).

Note:

sample is [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5], and jitter is small compared to samples.
The distribution of p-values is a bit non-uniform, which merits a bit of investigating.

StefanKarpinski · November 10, 2022, 2:53pm

I vaguely recall that the KS test assumes no ties. Is that perhaps the problem here? Using a stable sort would essentially have the effect of breaking all ties by order of appearance, I’m not sure if that’s sufficient or necessary to fix this.

GHTaarn · November 10, 2022, 3:31pm

I tried to add jitter to my example and it removed the warning, but the result was the same.

I am beginning to suspect that the implementation in ExactOneSampleKSTest does not work for discrete distributions. According to The Kolmogorov-Smirnov Goodness-of-fit Test - YouTube the Kolmogorov-Smirnov test can be used on discrete distributions, but I would not expect that code made for continuous distributions would give reliable results for discrete distributions, regardless of whether jitter is added to the sample(s) or not.

mschauer · November 10, 2022, 3:40pm

Yes, the Kolmogorov Smirnov test uses the continuity of the distribution. For discrete distributions it is more conservative, so not clear why it rejects, can you plot the empirical CDF versus the theoretical CDF?

GHTaarn · November 10, 2022, 4:09pm

I would expect that it rejects because it bases its decision on the x value that has the largest difference between the two CDFs and since the theoretical CDF is discontinuous and e.g. jumps from 0 to 0.5 for x=1, then the best case scenario for the largest difference would be ~0.25 which it apparently feels is enough to reject.

Dan · November 10, 2022, 4:24pm

They should work with discrete distributions but I think the implementations were written with the assumption of no ties, and so the tie case is not handled gracefully.

The jitter solution is good for the two-sample case (as shown in my previous response). The one-sample and two-sample cases are implemented separately and need different “fixes”.

Adding below an initial “fix” implementation, which you can test. Tried to pay attention to ties in generalizing the original implementation.

using HypothesisTests, Distributions, Random, StatsBase

import HypothesisTests: ksstats

#### UPDATE: This function isn't correct. Don't use.
# used in one-sided test
# function ksstats(x::AbstractVector{T}, d::UnivariateDistribution) where T<:Real
#     n = length(x)
#     cdfs = cdf.(Ref(d), sort(x))
#     δp = maximum(competerank(x) ./ n .- cdfs)  # fixed ranks from 1:n
#     δn = -minimum((n .- competerank(x, lt=(>))) ./ n .- cdfs)
#     δ = max(δn, δp)
#     (n, δ, δp, δn)
# end

# used in two-sided test
function ksstats(x::AbstractVector{T}, y::AbstractVector{S}) where {T<:Real, S<:Real}
    n_x, n_y = length(x), length(y)
    n = n_x + n_y
    randperm = shuffle(1:n)   # add a random perm to random tie-break
    sort_idx = sortperm([x; y][randperm])
    pdf_diffs = [ones(n_x)/n_x; -ones(n_y)/n_y][randperm][sort_idx]
    cdf_diffs = cumsum(pdf_diffs)
    δp = maximum(cdf_diffs)
    δn = -minimum(cdf_diffs)
    δ = max(δp, δn)
    (n_x, n_y, δ, δp, δn)
end

With these overwritten ksstats function, the original code should/might work.

GHTaarn · November 10, 2022, 4:55pm

Thank you, I ran this on the example in the original post and it “failed to reject h_0”, but it now seems to do this even when I would expect it to reject:

julia> ExactOneSampleKSTest(fill(31,9999),d)
┌ Warning: This test is inaccurate with ties
└ @ HypothesisTests ~/.julia/packages/HypothesisTests/BgrVj/src/kolmogorov_smirnov.jl:68
Exact one sample Kolmogorov-Smirnov test
----------------------------------------
Population details:
    parameter of interest:   Supremum of CDF differences
    value under h_0:         0.0
    point estimate:          0.00010001

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           1.0000

Details:
    number of observations:   9999


julia>

using the same d as in the original post.

Dan · November 10, 2022, 5:04pm

Oops. Need to have a look. In the meantime, the two-sample test is a little simpler and so probably correct now. It can be used as a stop gap:

julia> ApproximateTwoSampleKSTest(fill(1,9999), rand(d,9999))
┌ Warning: This test is inaccurate with ties
└ @ HypothesisTests ~/.julia/packages/HypothesisTests/BgrVj/src/kolmogorov_smirnov.jl:167
Approximate two sample Kolmogorov-Smirnov test
----------------------------------------------
Population details:
    parameter of interest:   Supremum of CDF differences
    value under h_0:         0.0
    point estimate:          0.50155

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           <1e-99

Details:
    number of observations:   [9999,9999]
    KS-statistic:              35.46317827975053

Dan · November 10, 2022, 6:36pm

It seems K-S test is not really suited for discrete distributions as-is, but has been extended to support them. The improved test is Cramer-von-Mises test. A good reference (I haven’t had time to read): http://www.stat.yale.edu/~jay/EmersonMaterials/DiscreteGOF.pdf.

Perhaps a Chi-square Goodness-of-fit test is worth exploring for this problem.

Dan · November 11, 2022, 1:26pm

Continuing from last post, here is a Chi-squared test for the discussed problem:

onesampleGOF(sample, d::DiscreteNonParametric) = begin
    cm = [get(countmap(sample),k,0) for k in d.support]
    sum(values(cm)) < length(sample) && error("Sample has elements outside support")
    return ChisqTest(cm, d.p)
end

and then it is possible to:

julia> pvalue(onesampleGOF(2 .* sample, d))
7.485816224539387e-10
julia> pvalue(onesampleGOF(sample, d))
1.0
julia> pvalue(onesampleGOF(fill(31,10), d))
0.0
julia> pvalue(onesampleGOF(rand(d,100), d))
0.9999999116094588
julia> pvalue(onesampleGOF(rand(1:51,10), d))
0.0

GHTaarn · November 14, 2022, 7:00pm

Thank you, so this is how to do a Chi Squared test with HypothesisTests, I actually ended up making my own basic implementation.

conalbrown · April 29, 2025, 8:24pm

I’ve been digging around in kolmogorov_smirnov.jl and found that the pvalue() function calculates the value of n on line 179 differently to other references.

function StatsAPI.pvalue(x::ApproximateTwoSampleKSTest; tail=:both)
    n = x.n_x*x.n_y/(x.n_x+x.n_y)
    if tail == :left
        exp(-2*n*x.δn^2)
    elseif tail == :right
        exp(-2*n*x.δp^2)
    elseif tail == :both
        pvalue(Kolmogorov(), sqrt(n)*x.δ; tail=:right)
    else
        throw(ArgumentError("tail=$(tail) is invalid"))
    end
end

Line 179 is

n = x.n_x*x.n_y/(x.n_x+x.n_y)

The other references I have seen divide the sum by the product (or add the inverses of n and m) [e.g. 1, 2, 3, 4] , whereas the product of n and m (n_x and n_y) is divided by the sum above.

Is there a mistake in the code or am I misinterpreting how the function goes on to calculate the p-value? I suspect the latter as it is calculating the p-value rather than the critical distance.

I notice this issue is still open in Github so replying off this post. Please moderate my post if needed and advise whether I should just post this directly to Github.

Topic		Replies	Views
How do I extract more than a pvalue HypothesisTests.jl? Statistics question , package , hypothesis-tests	1	120	September 30, 2024
Chi-Square test of a sample Statistics question	19	3521	July 16, 2021
HypothesisTests, rand and very low pvalue Statistics question	1	253	December 21, 2022
What's wrong with my chi-squared goodness of fits tests? Statistics	5	1761	July 10, 2021
Kolmogorov–Smirnov test, retrieve output Statistics question	2	1978	January 14, 2017

Kolmogorov-Smirnov test

Related topics