Fisher's test p-value results appear to differ from MATLAB, R

I was using HypothesisTests.jl and noticed minor but possibly important differences in the p-value between Julia and two other packages when using Fisher’s exact test. In this example, MATLAB and R users might claim a significant result, whereas the Julia user might not. This is a fairly common test, so it would be great if there were agreement across packages. Thanks for considering this comment for discussion.

julia (v0.6):

```julia
julia> FisherExactTest(59, 335, 172, 1366)
Fisher's exact test
Population details:
    parameter of interest:   Odds ratio
    value under h_0:         1.0
    point estimate:          1.3984544219625261
    95% confidence interval: (0.9980930945998796, 1.9393947540537153)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.051329212328076565    <------------------------

    contingency table:
         59   335
        172  1366
```


matlab:

```matlab
>> x = table([59;172],[335;1366])
x =
  2×2 table
    Var1    Var2
    ____    ____
     59      335
    172     1366

>> [h,p,stats]=fishertest(x,'Tail','both','Alpha',0.95)

h =

p =
   0.045036387203992    <--------------------------------

stats = 
  struct with fields:
             OddsRatio: 1.398715723707046
    ConfidenceInterval: [1.384515635855077 1.413061452741955]
```


R:

```r
> x = matrix(c(59,172,335,1366), nrow = 2)
> x
     [,1] [,2]
[1,]   59  335
[2,]  172 1366
> fisher.test(x, alternative = "two.sided")

	Fisher's Exact Test for Count Data

data:  x
p-value = 0.04503639                   <------------------------------
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.9980904309 1.9393926010
sample estimates:
 odds ratio 
```
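For what it’s worth, SciPy (an extra cross-check, not one of the two packages above; its two-sided p-value follows the same convention as R) agrees with MATLAB and R here:

```python
from scipy.stats import fisher_exact

# Same 2x2 table as above
oddsratio, p = fisher_exact([[59, 335], [172, 1366]], alternative="two-sided")
print(oddsratio)  # ~1.398716
print(p)          # ~0.045036
```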

I’m afraid I can’t help with the statistics question, but it will be easier for others to read your post and help you if you quote your code with backticks.


Might look into it, but from a quick look, there are adjustments to the test for GLMs, such as those for count data, which lead to the proper test being one-tailed. It might be that other software takes those cases into account, whereas Julia is not aware that it is a count model when the user requests a two-tailed version.

For asymmetric distributions, p-values for two-sided alternatives are not well defined. This has been discussed a couple of times before; see the earlier discussions and the references there. See also the docstring for the `pvalue` method, which states that

```
For tail = :both, possible values for method are:

    •    :central (default): Central interval, i.e. the p-value is two times the minimum of the
        one-sided p-values.

    •    :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing all
        tables with the same marginals that are equally or less probable:

          p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)
```

so you can get the p-value used by R and MATLAB by specifying the `method` keyword, i.e.

```julia
julia> t = FisherExactTest(59, 335, 172, 1366);

julia> pvalue(t, method = :minlike)
```

This value is not more correct than the other one.
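To make the two conventions concrete, here is a small Python sketch (assuming SciPy is available; the variable names are mine, and the table is the one from the original post) that reproduces both p-values from the conditional hypergeometric distribution:

```python
from scipy.stats import hypergeom

# 2x2 table [[59, 335], [172, 1366]] and its margins
a = 59                       # observed top-left cell
M = 59 + 335 + 172 + 1366    # grand total
K = 59 + 172                 # first-column total
n = 59 + 335                 # first-row total

# Conditional on the margins, the top-left cell is hypergeometric.
def f(k):
    return hypergeom.pmf(k, M, K, n)

# :minlike (the R/MATLAB two-sided value): sum the probabilities of all
# tables no more likely than the observed one.  The small tolerance
# guards against floating-point ties, as R's fisher.test does.
p_minlike = sum(f(k) for k in range(min(K, n) + 1) if f(k) <= f(a) * (1 + 1e-7))

# :central (the HypothesisTests.jl default): twice the smaller one-sided tail.
p_less = hypergeom.cdf(a, M, K, n)        # P(X <= a)
p_greater = hypergeom.sf(a - 1, M, K, n)  # P(X >= a)
p_central = 2 * min(p_less, p_greater)

print(p_minlike)  # ~0.045036, the MATLAB/R value
print(p_central)  # ~0.051329, the Julia value
```

Both are legitimate two-sided p-values for the same table; they simply define “at least as extreme” differently.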


I’m curious, does R or MATLAB give the option of calculating the `:central` value that is the default in Julia?

Robin, thanks for the advice on posting. Jose, thanks for the advice on modeling. And Andreas, thank you for a great explanation. I was easily able to reproduce the MATLAB/R results. For a newbie like me who just used the inline REPL help, an additional sentence in the docstring giving a little more detail on how alternative p-values can be obtained would be helpful!

R does (my first post, I hope I got it right!):

```r
> library(exact2x2)
> x = matrix(c(59,172,335,1366), nrow = 2)
> x
     [,1] [,2]
[1,]   59  335
[2,]  172 1366
> exact2x2(x, tsmethod="central")

	Central Fisher's Exact Test

data:  x
p-value = 0.05133
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.9980904 1.9393926
sample estimates:
odds ratio
```