[ANN] PermutationTests.jl, a package for multiple hypothesis testing

Hello, I am glad to announce PermutationTests.jl, a fast, comprehensive and well-documented package for univariate and multiple-comparison hypothesis tests by data permutation.

In practice, this is useful when your data do not satisfy the assumptions of parametric tests and when you need to test many hypotheses simultaneously, some or all of which may be correlated. This happens, for example, in neuroimaging and gene expression studies.

More generally, permutation tests give you a powerful test with minimal assumptions while fully controlling the Type I error rate.

To my knowledge, there is no other Julia package dedicated to multiple hypothesis testing yet, so PermutationTests.jl fills a gap in the Julia statistics ecosystem.

It seems nice, great work.

I have several suggestions:

  • The output color after rTest: I do not use a black background, so the text highlighted in color is hard to read.
  • The target is not clear. How does the package differ from other similar packages? You mention them in the documentation, but a short comment would be nice.
  • Is the list of tests closed? I ask because I use non-parametric tests a lot, and I am not sure whether they could be covered by the package.

Anyway, thank you very much for the package and its documentation.

Hi,

here are some answers:

  • Output color: Good point, I will make it a little darker.
  • The scope of the package: Here is some more explanation. The main advantage of permutation tests appears when you need to perform a large number of tests simultaneously and the hypotheses may be correlated. In this case permutation tests offer the greatest power while rigorously controlling the family-wise error rate. For example, if you estimate brain activity in thousands of voxels, that activity will certainly be correlated locally in the brain and possibly non-locally too. Using a Bonferroni-like or FDR-like correction (the standard procedures for controlling for multiple comparisons) will result in less power. Permutation tests are attractive for several other reasons as well: you can get an exact test; you can use any test statistic (say, any coefficient you can extract from your data), not only the usual ones such as Student's t, F, etc., for which the distribution under the null hypothesis is known; the test adapts automatically to the form and degree of correlation among hypotheses; they are more robust to outliers; they rely on much less stringent assumptions than parametric tests (for example, Gaussianity of the data distribution); and, as a consequence of this last point, you do not need to resort to rank-based statistics just because your data violate an assumption of the parametric test.
  • The list of tests: No, it is not closed. New tests can be added to PermutationTests.jl, or you can create your own test using the package.
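
To make the multiple-comparison argument concrete, here is a minimal sketch in plain Python. It is written for illustration only and does not use or reproduce the PermutationTests.jl API. It assumes a two-group design and uses the max-statistic approach, which is one common way to obtain p-values adjusted for the family-wise error rate. The function names `mean_diffs` and `max_stat_permutation_test`, and the simulated data, are hypothetical.

```python
import random
from statistics import mean


def mean_diffs(data, labels):
    """Absolute difference of the two group means, one value per variable."""
    n_vars = len(data[0])
    stats = []
    for j in range(n_vars):
        group0 = [row[j] for row, g in zip(data, labels) if g == 0]
        group1 = [row[j] for row, g in zip(data, labels) if g == 1]
        stats.append(abs(mean(group0) - mean(group1)))
    return stats


def max_stat_permutation_test(data, labels, n_perm=2000, seed=0):
    """Adjusted p-values from the permutation distribution of the maximum statistic.

    Group labels are shuffled as a whole, so every variable sees the same
    relabelling and the correlation among variables is preserved.
    """
    rng = random.Random(seed)
    observed = mean_diffs(data, labels)
    # Start each count at 1: the observed labelling is itself one permutation.
    exceed = [1] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_stat = max(mean_diffs(data, shuffled))
        for j, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[j] += 1
    return [count / (n_perm + 1) for count in exceed]


# Simulated example (an assumption, not real data): 10 subjects per group,
# 4 variables, and a true group effect on variable 0 only.
data_rng = random.Random(42)
labels = [0] * 10 + [1] * 10
data = [
    [data_rng.gauss(0, 1) + (5.0 if g == 1 and j == 0 else 0.0) for j in range(4)]
    for g in labels
]
adjusted = max_stat_permutation_test(data, labels)
print(adjusted)
```

Because each observed statistic is compared with the maximum over all variables under the same relabelling, the adjustment reflects whatever correlation structure the data have, which is the property described in the answer above.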

Note that, in general, you do not need non-parametric tests if you use permutation tests. The univariate test will always be valid and exact (or approximately exact) when you use the permutation test that is equivalent to the parametric test you wish to use, regardless of the distribution of the data. As a matter of fact, many non-parametric tests ARE permutation tests; they are performed on ranked data so that the p-value can be obtained without actually listing the permutations. For instance, the popular Spearman correlation, Mann-Whitney and many others are in fact permutation tests!
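
The link between rank-based tests and permutation tests can be illustrated with a short Python sketch (again for illustration only, not the PermutationTests.jl API). The helper names `pearson`, `ranks` and `permutation_pvalue` are hypothetical, and the p-value is approximated with random permutations rather than the full enumeration. Running the same correlation test on the ranks of the data gives a permutation version of the Spearman test.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def permutation_pvalue(x, y, statistic, n_perm=5000, seed=0):
    """Two-sided p-value obtained by shuffling y while keeping x fixed."""
    rng = random.Random(seed)
    observed = abs(statistic(x, y))
    y_perm = list(y)
    count = 1  # the observed pairing counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # Small tolerance so rounding error does not hide exact ties.
        if abs(statistic(x, y_perm)) >= observed - 1e-12:
            count += 1
    return count / (n_perm + 1)


# Made-up data: a monotone but strongly non-linear relationship.
x = list(range(1, 11))
y = [v ** 3 for v in x]

p_raw = permutation_pvalue(x, y, pearson)                  # test on the raw data
p_ranked = permutation_pvalue(ranks(x), ranks(y), pearson)  # Spearman-type test
print(p_raw, p_ranked)
```

The only difference between the two calls is the rank transformation of the input; the permutation machinery is the same in both.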

Check the references I give in the documentation for more information.