What's wrong with my chi-squared goodness of fits tests?

sylvaticus · July 9, 2021, 9:21am

Hi there. I am following a MOOC where I am learning goodness of fits tests, in particular the chi-square test for discrete rv and the Kolmogorov-Smirnov test for continuous rv.

I implemented the tests in the following code. It gives results that are consistent with the output of the exercises in the course, however when I simulate the test on true H0 even with LOT of repetitions, I have very low type 1 errors, much lower than the alpha level set in the test, except I am working for uniforms for which the tests accept (wrongly) H1 with a rate alpha.

Here the difinition of the tests…

using Distributions

# The test definitions.....
function computeDensity(data,support)
    counts =  [count(i -> i==s,data) for s in support]
    if length(data) > sum(counts)
        error("There are some data not in the support !")
    end
    return counts
end

""" 
    goodnessOfFitDiscrete(data,support,f₀;compressedData=true,α=0.05,d=0)

Perform a goodness to fit chi-squared test to check for a particular MMF.

The passed distribution must support the method `pdf(dist,x)` for the provided support.
H₀ can be either the PDF with a specified set of parameters or the PDF in general. In this case the distribution object should be passed to this function with the MLE estimators that best fit the data (it is NOT done inside this function). In such case  the `d` parameter should be set to the number of estimated parameters in order to remove the `d` degree of freedom from the chi-square test.
"""
function goodnessOfFitDiscrete(data,support,f₀;compressedData=true,α=0.05,d=0)
    if !compressedData
        data   = computeDensity(data,support)
    end
    K          = length(support)
    N          = sum(data)
    p̂          = data ./ N
    df         = K - d - 1
    p0         = pdf.(f₀,support)
    T          = N * sum((p̂[k] - p0[k])^2/p0[k] for k in 1:K)
    χDist      = Chisq(df)
    rejectedH₀ = T > quantile(χDist,1-α)
    p_value    = 1 - cdf(χDist,T)
    return (testValue=T, threashold=quantile(χDist,1-α),rejectedH₀=rejectedH₀, p_value=p_value)
end

When I use the Uniform distribution, I obtain the expected number of errors:

f₀          = DiscreteUniform(1,100)
repetitions = 5000
support     = 1:100
outs        = fill(false,repetitions)
for rep in 1:repetitions
    data      = rand(f₀,10000)
    out       = goodnessOfFitDiscrete(data,support,f₀,compressedData=false,α=0.05,d=0)
    outs[rep] = out.rejectedH₀
end
sum(outs)/repetitions # around 0.05

However when I use an other distribution, even a simmetric and unimodal one like the Binomial, I have much lower numbers:

f₀          = Binomial(100,0.7)
repetitions = 5000
support     = 0:100
outs        = fill(false,repetitions)
for rep in 1:repetitions
    data      = rand(f₀,10000)
    out       = goodnessOfFitDiscrete(data,support,f₀,compressedData=false,α=0.05,d=0)
    outs[rep] = out.rejectedH₀
end
sum(outs)/repetitions # 0.02 or 0.03 instead of 0.05

Crazy enough, if I did try to increase alpha to e.g. 0.1 or 0.2, I can see that the output of the test when using the uniform adapt and match the new alpha, but for the binomial it remains to very low levels… even if I set my test to reject H0 even with a very high error probability, it almost never does!

Is there an “official” implementation of the chi-square test for goodness of fit I can try with ?

sylvaticus · July 9, 2021, 10:02am

It seems I am giving the same results as the test on HyphothesisTesting.jl package:

But this means that the power of the test, in the worst case of the true parameter different but very close to the H0 one, is lower than the expected one, i.e. the test is more conservative than what theoretically should be…

Christopher_Fisher · July 9, 2021, 11:29am

If I remember correctly, the test assumes an expected frequency of 5 or more for 80% or more categories. Does that hold in your simulation?

sylvaticus · July 9, 2021, 12:08pm

I wasn’t aware of that heuristics.
My Binomial(100,0.7) on x <= 19 has a PMF of just 6.689300172065137e-26 or less, so even if I sample 2000 observations I will likely have no observations on x <= 19.

Thanks for the info!

Christopher_Fisher · July 9, 2021, 12:27pm

No problem. In that case, a common solution is to combine categories so that the assumptions of the test are satisfied. The sources I found were not clear, but correcting the categories might get you closer to the theoretical alpha of .05.

genkuroki · July 10, 2021, 9:47am

Additional details

The chi-square test can only be used if the distribution of testValues is approximated by the chi-square distribution.

In the standard goodness-of-fit test, the degree of freedom of the chi-square distribution should be equal to (the number of elements in the support of the discrete distribution to be tested) - 1.

However, the real (or substantial) support of Binomial(100, 0.7) is at most between 40 and 100. So, in this case, we can expect that the degree of freedom should be roughly around 60.

Let’s check this in the following.

The mathematically idealized support of DiscreteUniform(0, 100) and Binomial(100, 0.7) are both 0:100, but the real support of the latter is much smaller than the former.

using Distributions, StatsPlots
P = plot(DiscreteUniform(0, 100); label="DiscreteUniform(0, 100)", xtick=0:10:100, ylim=(0, 0.013))
Q = plot(Binomial(100, 0.7); label="Binomial(100, 0.7)", legend=:topleft, xtick=0:10:100)
plot(P, Q; size=(800, 250))

using Distributions, Random, StatsPlots

# The test definitions.....
function computeDensity(data, supp)
    counts =  [count(i -> i==s,data) for s in supp]
    if length(data) > sum(counts)
        error("There are some data not in the support !")
    end
    return counts
end

"""
Modified version of `goodnessOfFitDiscrete` function in
https://discourse.julialang.org/t/whats-wrong-with-my-chi-squared-goodness-of-fits-tests/64334

* The old argument `support` is deleted.
* The new local variable `supp` is defined to be `support(f₀)`
"""
function goodnessOfFitDiscrete(data, f₀; compressedData=true, α=0.05, d=0)
    supp       = support(f₀)
    if !compressedData
        data   = computeDensity(data, supp)
    end
    K          = length(supp)
    N          = sum(data)
    p̂          = data ./ N
    df         = K - d - 1
    p0         = pdf.(f₀,supp)
    T          = N * sum((p̂[k] - p0[k])^2/p0[k] for k in 1:K)
    χDist      = Chisq(df)
    rejectedH₀ = T > quantile(χDist, 1-α)
    p_value    = 1 - cdf(χDist, T)
    return (testValue=T, threashold=quantile(χDist,1-α), rejectedH₀=rejectedH₀, p_value=p_value)
end

function repeat_tests(f₀; datasize = 10000, repetitions = 10000, α = 0.05, d = 0)
    testValue = zeros(repetitions)
    rejectedH₀ = falses(repetitions)
    data = rand(f₀, datasize)
    for rep in 1:repetitions
        rand!(f₀, data)
        out = goodnessOfFitDiscrete(data, f₀; compressedData=false, α, d)
        testValue[rep] = out.testValue
        rejectedH₀[rep] = out.rejectedH₀
    end
    α_real = sum(rejectedH₀)/repetitions
    (; α_real, testValue, rejectedH₀)
end

function plot_testValue_dist(f₀; datasize = 10000, repetitions = 10000, α = 0.05, d=0)
    α_real, testValue, rejectedH₀ = repeat_tests(f₀; datasize, repetitions, α, d)

    title = "$f₀   (real α = $α_real / nominal α = $α)"
    plot(; title, titlefontsize=10)

    xlim=(0, 2length(support(f₀)))
    a, b = xlim
    histogram!(testValue[a .≤ testValue .≤ b]; norm=true, alpha=0.3, label="testValue", xlim)

    df = length(support(f₀)) - d - 1
    plot!(Chisq(df), a, b; label="Chisq(df = $df)", ls=:dash, lw=2)
end

In the DiscreteUniform(0, 100) case, the distribution of testValues is almost exactly equal to the chi-square distribution.

plot_testValue_dist(DiscreteUniform(0, 100))

2021-07-10 (1)

But, in the Binomial(100, 0.7) case, this is not the case. The real (or empirical) α value is much smaller than the nominal α value.

plot_testValue_dist(Binomial(100, 0.7))

2021-07-10 (2)

When the degree of freedom is set to 60, the empirical α value is close to the nominal α value 0.05.

plot_testValue_dist(Binomial(100, 0.7); d = 40)

Topic		Replies	Views
Kolmogorov-Smirnov test Statistics distributions , hypothesis-tests	22	2966	April 30, 2025
Example Chi Square test - why different answers from HypothesisTests and Distances? Statistics first-steps	6	4027	February 26, 2020
Chi-Square test of a sample Statistics question	19	3535	July 16, 2021
Testing if data is Gamma Distributed Statistics	7	3317	June 5, 2018
Hypothesis testing in Julia Statistics question	4	1826	March 14, 2022

What's wrong with my chi-squared goodness of fits tests?

Related topics