Package for Confidence Intervals?

Is there a package dealing with these kinds of computations, or should I make my own functions for my purposes?

I use Measurements.jl.

Thanks for the reply.

I was looking at the package and wondering if it would fit what I want. At the moment it doesn’t look like it, but I could be wrong. Can you elaborate, and possibly give examples?

At the moment I’m at least looking, for example, for something that would simply let me plug values into functions to get confidence intervals for (arithmetic) means, proportions, standard deviations, differences and sums etc. in the context of statistical estimation theory.

It would be nice if it’d let me plug in the values you would normally encounter when calculating by hand, such as the arithmetic mean, n, N, p, etc.

Can you expand a bit on the application you have in mind? This way it’ll be easier to help you.

For example, if you have quantities with uncertainties and want to perform calculations with them you’re looking for error propagation in which case the mentioned Measurements.jl or perhaps MonteCarloMeasurements.jl would be helpful. On the other hand, if you have a time series of data - perhaps even correlated - and want to estimate the standard error you might want to take a look at BinningAnalysis.jl or similar.


We posted simultaneously. Please see my previous post.

Are your values statistically independent or correlated?

Assuming that I understand your use case correctly: if your values are independent, you could simply use the functions provided by the Statistics standard library and StatsBase.jl. If they are correlated, you could look at BinningAnalysis.jl linked above, which uses logarithmic binning to estimate the standard error of the mean of your values.


Thanks for the input.

My use case is simple. It is like I explained earlier. Imagine you have formulas for estimating population parameters with confidence intervals, based on sample statistics (such as sample means and proportions), and you are calculating exactly according to these formulae. Is there a package that has these formulae, or should I rather write them myself?

For example, when you want to estimate a rate from a binomial sample (with a Beta prior), the posterior on the rate is a Beta distribution, and you can compute the CI on that, but you need to know that it’s a Beta and how the data go into it together with the prior.

I don’t know of a package that does that easily. Usually I do it myself using Distributions.jl and Wikipedia, which is indeed not ideal.
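A minimal sketch of that do-it-yourself approach, assuming a conjugate Beta prior (uniform, Beta(1, 1), by default) and an equal-tailed interval. The helper name beta_credible_interval is made up for illustration and is not part of Distributions.jl:

```julia
using Distributions

# Hypothetical helper (not from any package): equal-tailed credible interval
# for a binomial rate, given k successes in n trials and a Beta prior.
function beta_credible_interval(k, n; level = 0.9, prior = Beta(1, 1))
    a, b = params(prior)               # prior shape parameters
    post = Beta(a + k, b + n - k)      # conjugate update
    tail = (1 - level) / 2
    return quantile(post, tail), quantile(post, 1 - tail)
end

lo, hi = beta_credible_interval(10, 100)
```

A different prior changes the posterior and therefore the interval, which is exactly the part you need to decide on yourself.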

In case you are considering population parameters like the mean or variance, there are a number of confint methods spread across Statistics, StatsBase and in particular HypothesisTests:

# 12 methods for generic function "confint":
[1] confint(x::BinomialTest; level, tail, method) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/binomial.jl:104
[2] confint(x::SignTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/binomial.jl:218
[3] confint(x::FisherExactTest; level, tail, method) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/fisher.jl:181
[4] confint(x::PowerDivergenceTest; level, tail, method, correct, bootstrap_iters, GC) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/power_divergence.jl:73
[5] confint(obj::StatisticalModel) in StatsBase at /Users/smoritz/.julia/packages/StatsBase/DyWPR/src/statmodels.jl:32
[6] confint(x::HypothesisTests.TTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/t.jl:37
[7] confint(x::HypothesisTests.ZTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/z.jl:37
[8] confint(x::ExactSignedRankTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/wilcoxon.jl:164
[9] confint(x::ApproximateSignedRankTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/wilcoxon.jl:239
[10] confint(test::CorrelationTest{T}) where T in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/correlation.jl:61
[11] confint(test::CorrelationTest{T}, level::Float64) where T in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/correlation.jl:61
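As a rough illustration of how these methods can be called with summary statistics only, here is a sketch. It assumes a HypothesisTests version where the keyword is level, as in the listing above (older releases may differ), and the numbers are made up:

```julia
using HypothesisTests

# Made-up summary statistics, as you would have them when calculating by hand.
xbar, s, n = 50.0, 10.0, 100

# CI for a mean, treating s as the known standard deviation (z-based).
ci_z = confint(OneSampleZTest(xbar, s, n); level = 0.95)

# Same inputs, but based on Student's t with n - 1 degrees of freedom.
ci_t = confint(OneSampleTTest(xbar, s, n); level = 0.95)

# CI for a proportion: 10 successes out of 100 trials.
# The default method is Clopper-Pearson; :wilson is one of the alternatives.
ci_p = confint(BinomialTest(10, 100); level = 0.95, method = :wilson)
```

Each call returns a (lower, upper) tuple. The tests are constructed only to carry the summary statistics; their p-values are not used here.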

You can probably make https://github.com/MikeInnes/Poirot.jl do what you want?

I may be misunderstanding, but it sounds like you are under the impression that there is a single answer to the question of building confidence intervals. Unfortunately, in statistics this is not the case. The appropriate method in any application hinges critically on the true data generating process (e.g. is your data IID, weakly dependent, non-stationary, etc.) in combination with the statistic of interest. We need to know these things to direct you to an appropriate package. If you are asking whether there is some package that universally builds appropriate confidence intervals for all data types, well, I’m not aware of any such package in any programming language.


You may not be aware of this, but there is no single set of formulas that fits each application, except for some special cases under very, very specific assumptions.

There exist closed-form formulas for some models and methodologies; for some others you can obtain the intervals numerically rather easily; and for the remaining models you have to use Monte Carlo methods.
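To make the Monte Carlo case concrete, here is a minimal sketch of a percentile bootstrap interval. It assumes IID observations (it is not appropriate for dependent data), and bootstrap_ci is a made-up helper name, not a function from any package:

```julia
using Random, Statistics

# Hypothetical helper: percentile bootstrap CI for an arbitrary statistic.
# Assumes the observations in x are IID.
function bootstrap_ci(stat, x; level = 0.95, B = 2000, rng = Random.default_rng())
    n = length(x)
    # Recompute the statistic on B resamples drawn with replacement.
    reps = [stat(x[rand(rng, 1:n, n)]) for _ in 1:B]
    tail = (1 - level) / 2
    return quantile(reps, tail), quantile(reps, 1 - tail)
end

# Simulated data, only for illustration.
x = randn(MersenneTwister(1), 200) .+ 3.0
lo, hi = bootstrap_ci(median, x)
```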

It would be easier to help if you specified

  • the statistical model you are interested in,
  • the methodology you are using (Bayesian? frequentist?)
  • the kind of confidence interval you want (frequentist CI, Bayesian HPD, etc)

Thinking about it, what’s missing is a way to get a posterior on the parameters when fitting a distribution; currently Distributions.jl’s fit only returns the MLE. With that, if you wanted to estimate a frequency, you could do something like this (my_fit, posterior and confidence_interval are hypothetical):

julia> dfit = my_fit(Binomial, 100, [10])
Fitted{Binomial}(...)

julia> mle(dfit)
Binomial{Float64}(n=100, p=0.1)

julia> p = posterior(dfit, :p)
Beta{Float64}(α=10.0, β=90.0)

julia> confidence_interval(p, 0.9)
(0.05583217884206651, 0.15327514365732653)

In some relevant cases there are closed-form formulas for the posteriors, but posterior could also return a sampled or approximated distribution when that is not the case.


And here, there is https://github.com/JuliaStats/ConjugatePriors.jl


For proportions you can see https://github.com/PharmCat/ClinicalTrialUtilities.jl or take code from there (ci.jl)

I wonder if you can suggest a book on confidence intervals of different models and methodologies…

I am not aware of an introductory textbook that compares various approaches — each one usually deals with its own. But if you are really interested, I would recommend

@article{berger1988likelihood,
  title={The likelihood principle},
  author={Berger, James O and Wolpert, Robert L and Bayarri, MJ and DeGroot, MH and Hill, Bruce M and Lane, David A and LeCam, Lucien},
  journal={Lecture notes-Monograph series},
  volume=6,
  year=1988,
  publisher={JSTOR}
}

which is great fun. Working through the book, you will learn a lot of useful facts about the principles of statistics, which you can weave into lunchtime conversations with colleagues up to the point that they will be inclined to dump a plate of lasagna on your head.

But the gist is really simple: frequentist (Neyman) CIs usually don’t mean what people assume they mean, and a Bayesian HPD is a nice posterior visualization tool. For serious modeling I would go for posterior predictive checks instead, e.g.

@article{gelman1996posterior,
  title={Posterior predictive assessment of model fitness via realized discrepancies},
  author={Gelman, Andrew and Meng, Xiao-Li and Stern, Hal},
  journal={Statistica sinica},
  pages={733--760},
  year=1996,
  publisher={JSTOR}
}

Incidentally, Andrew Gelman has a lot of neat articles on p-values.


Thanks for your replies.

My use case is based on a table of z critical values and their respective confidence levels (as probabilities between 0 and 1), of which I use the common values.

I want to make my own functions (or use existing ones) to calculate confidence intervals, proportions, means, and the like, going from sample to population. A part will be to plug in a z critical value into the function. Calculations are made (I have no time to write out these formulas here now, maybe later.) I output the calculated information nicely, including saying what the confidence percentage is, and so on.

And I want to also simply plug in the desired confidence level. So this value must be converted to the z critical value, which may then be used in computation.

Suppose you have a table such as this (I hope it renders well; I don’t use this so often):

Conf. level   .9973   .99    .98    .6827
z_c           3.00    2.58   2.33   1.00

etc.

If you’d input 3.00 for z_c, what is the formula to yield .9973? If you input .99, what is the formula to yield 2.58?

In this case, if I make those functions myself, and perhaps share them with others if there’s interest, I am particularly interested in these two formulas.

I think you want the quantile function from Distributions.jl:

julia> z = Normal()
Normal{Float64}(μ=0.0, σ=1.0)

julia> -quantile.(z, 0.5 .* (1 .- [.9973, .99, .98, .6827]))
4-element Array{Float64,1}:
 2.999976992703395
 2.5758293035489053
 2.326347874040846
 1.000021713322999

(note that to get the quantile from the 2-sided p value which you seem to be doing here, you have to adjust the p value to be half the distance from 1, hence the 0.5 .* (1 .- p) bit)
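Both directions of the table lookup, plus the plug-in intervals asked about, could be sketched as follows. The helper names are made up for illustration, and the intervals are the plain z-based textbook formulas, which assume a large sample and independent observations:

```julia
using Distributions

# Hypothetical helper names, not from any package.

# z critical value -> two-sided confidence level, P(-z <= Z <= z)
confidence_level(z) = 2 * cdf(Normal(), z) - 1

# two-sided confidence level -> z critical value
z_critical(level) = quantile(Normal(), (1 + level) / 2)

# z-based interval for a mean from summary statistics
function mean_ci(xbar, s, n; level = 0.95)
    half = z_critical(level) * s / sqrt(n)
    return (xbar - half, xbar + half)
end

# Wald interval for a proportion
function proportion_ci(p, n; level = 0.95)
    half = z_critical(level) * sqrt(p * (1 - p) / n)
    return (p - half, p + half)
end

confidence_level(3.00)   # ≈ 0.9973
z_critical(0.99)         # ≈ 2.5758
```

Using (1 + level) / 2 is the same adjustment as above, just taken from the upper tail so that no sign flip is needed.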

Thanks for the book recommendation, I just started reading it (“The likelihood principle”). I wish I had read it a few years earlier :see_no_evil:
