Is there a package dealing with these kinds of computations, or should I make my own functions for my purposes?
I use Measurements.jl.
Thanks for the reply.
I was looking at the package and wondering if it would fit what I want. At the moment it doesn't look like it, but I could be wrong. Can you elaborate, and possibly give examples?
At the moment I'm looking, for example, for something that would simply let me plug values into functions to get confidence intervals for (arithmetic) means, proportions, standard deviations, differences, sums, etc., in the context of statistical estimation theory.
It would be nice if it'd let me plug in the values you would normally encounter when calculating by hand, such as the arithmetic mean, n, N, p, etc.
Can you expand a bit on the application you have in mind? That way it'll be easier to help you.
For example, if you have quantities with uncertainties and want to perform calculations with them, you're looking for error propagation, in which case the mentioned Measurements.jl or perhaps MonteCarloMeasurements.jl would be helpful. On the other hand, if you have a time series of data, perhaps even correlated, and want to estimate the standard error, you might want to take a look at BinningAnalysis.jl or similar.
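To illustrate the first case, here is a minimal error-propagation sketch with Measurements.jl; the numbers are made-up illustrative values:

```julia
using Measurements

# Quantities with Gaussian uncertainties (illustrative values)
a = 1.0 ± 0.1
b = 2.0 ± 0.2

# Ordinary arithmetic propagates the uncertainties automatically
c = a + b
d = a * b
```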
We posted simultaneously. Please see my previous post.
Are your values statistically independent or correlated?
Assuming that I understand your use case correctly, you could simply use the functions provided by the Statistics standard library and StatsBase.jl. In the latter case, you could look at BinningAnalysis.jl, linked above, which uses logarithmic binning to estimate the standard error of the mean of your values.
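For the independent-samples case, a quick sketch of what those two packages provide (`sem` comes from StatsBase):

```julia
using Statistics, StatsBase

x = randn(1_000)   # some sample

m  = mean(x)       # sample mean (Statistics)
s  = std(x)        # sample standard deviation (Statistics)
se = sem(x)        # standard error of the mean (StatsBase)
```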
Thanks for the input.
My use case is simple; it is as I explained earlier. Imagine you have formulas for estimating population parameters with confidence intervals, based on sample statistics (such as sample means or proportions), and you are calculating exactly according to these formulas. Is there a package that already has these formulas, or should I rather write them myself?
For example, when you want to estimate a rate from a binomial sample, the posterior on the rate is a Beta distribution, and you can compute the CI on that, but you need to know that it's a Beta and how the data goes into it with the prior.
I don't know of a package that does that easily; usually I do it myself using Distributions.jl and Wikipedia, which is indeed not ideal.
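For the binomial-rate example above, a sketch of that do-it-yourself route with Distributions.jl, assuming a uniform Beta(1, 1) prior:

```julia
using Distributions

successes, trials = 10, 100

# Beta(1, 1) prior + binomial likelihood -> Beta posterior (conjugacy)
post = Beta(1 + successes, 1 + trials - successes)

# Equal-tailed 90% credible interval from the posterior quantiles
lo, hi = quantile(post, 0.05), quantile(post, 0.95)
```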
In case you consider population parameters like the mean or variance, there is a number of confint methods distributed over Statistics, StatsBase, and in particular HypothesisTests:
# 12 methods for generic function "confint":
[1] confint(x::BinomialTest; level, tail, method) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/binomial.jl:104
[2] confint(x::SignTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/binomial.jl:218
[3] confint(x::FisherExactTest; level, tail, method) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/fisher.jl:181
[4] confint(x::PowerDivergenceTest; level, tail, method, correct, bootstrap_iters, GC) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/power_divergence.jl:73
[5] confint(obj::StatisticalModel) in StatsBase at /Users/smoritz/.julia/packages/StatsBase/DyWPR/src/statmodels.jl:32
[6] confint(x::HypothesisTests.TTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/t.jl:37
[7] confint(x::HypothesisTests.ZTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/z.jl:37
[8] confint(x::ExactSignedRankTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/wilcoxon.jl:164
[9] confint(x::ApproximateSignedRankTest; level, tail) in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/wilcoxon.jl:239
[10] confint(test::CorrelationTest{T}) where T in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/correlation.jl:61
[11] confint(test::CorrelationTest{T}, level::Float64) where T in HypothesisTests at /Users/smoritz/.julia/packages/HypothesisTests/wSEbN/src/correlation.jl:61
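For instance, two of the methods above in use (the exact numbers will depend on your data and package version, so no output is shown):

```julia
using HypothesisTests

# CI for a proportion: 10 successes out of 100 trials
confint(BinomialTest(10, 100))

# CI for a mean, from raw data, via a one-sample t-test
confint(OneSampleTTest([1.2, 0.9, 1.1, 1.0, 1.3]))
```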
You can probably make https://github.com/MikeInnes/Poirot.jl do what you want?
I may be misunderstanding, but it sounds like you are under the impression that there is a single answer to the question of building confidence intervals. Unfortunately, in statistics this is not the case. The appropriate method in any application hinges critically on the true data-generating process (e.g. is your data IID, weakly dependent, non-stationary, etc.) in combination with the statistic of interest. We need to know these things to direct you to an appropriate package. If you are asking whether there is some package that universally builds appropriate confidence intervals for all data types, well, I'm not aware of any such package in any programming language.
You may not be aware of this, but there is no single set of formulas that fits each application, except for some special cases under very, very specific assumptions.
There exist closed-form formulas for some models and methodologies; for some others you can obtain them numerically rather easily; and of course for other models you have to use Monte Carlo methods.
It would be easier to help if you specified
- the statistical model you are interested in,
- the methodology you are using (Bayesian? frequentist?),
- the kind of confidence interval you want (frequentist CI, Bayesian HPD, etc.).
Thinking about it, what's missing is a way to get a posterior on parameters when fitting a distribution; currently Distributions.jl's fit only returns the MLE. That way, if you want to estimate a frequency, you could do something like this:
julia> dfit = my_fit(Binomial, 100, [10])
Fitted{Binomial}(...)
julia> mle(dfit)
Binomial{Float64}(n=100, p=0.1)
julia> p = posterior(dfit, :p)
Beta{Float64}(α=10.0, β=90.0)
julia> confidence_interval(p, 0.9)
(0.05583217884206651, 0.15327514365732653)
In some relevant cases there are closed-form formulas for the posteriors, but posterior could also return a sampled or approximated distribution when that's not the case.
There is also ConjugatePriors.jl (https://github.com/JuliaStats/ConjugatePriors.jl), a Julia package to support conjugate prior distributions.
For proportions you can see https://github.com/PharmCat/ClinicalTrialUtilities.jl, or take code from there (ci.jl).
I wonder if you can suggest a book on confidence intervals of different models and methodologies…
I am not aware of an introductory textbook that compares various approaches; each one usually deals with its own. But if you are really interested, I would recommend
@article{berger1988likelihood,
  title={The likelihood principle},
  author={Berger, James O and Wolpert, Robert L and Bayarri, MJ and DeGroot, MH and Hill, Bruce M and Lane, David A and LeCam, Lucien},
  journal={Lecture Notes--Monograph Series},
  volume={6},
  year={1988},
  publisher={JSTOR}
}
which is great fun. Working through the book, you will learn a lot of useful facts about the principles of statistics, which you can weave into lunchtime conversations with colleagues up to the point that they will be inclined to dump a plate of lasagna on your head.
But the gist is really simple: frequentist (Neyman) CIs usually don't mean what people assume they mean, and a Bayesian HPD is a nice posterior visualization tool. I would go for posterior predictive checks instead for serious modeling, e.g.
@article{gelman1996posterior,
  title={Posterior predictive assessment of model fitness via realized discrepancies},
  author={Gelman, Andrew and Meng, Xiao-Li and Stern, Hal},
  journal={Statistica Sinica},
  pages={733--760},
  year={1996},
  publisher={JSTOR}
}
Incidentally, Andrew Gelman has a lot of neat articles on p-values.
Thanks for your replies.
My use case is based on a table of z critical values and their respective probabilities (between 0 and 1), where I use common values.
I want to make my own functions (or use existing ones) to calculate confidence intervals for proportions, means, and the like, going from sample to population. Part of this is plugging a z critical value into the function. Calculations are made (I have no time to write out these formulas here now; maybe later). I then output the calculated information nicely, including stating what the confidence percentage is, and so on.
And I want to also simply plug in the desired confidence level. So this value must be converted to the z critical value, which may then be used in computation.
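If you end up writing these yourself, here is a minimal sketch of such a function for the mean (mean_ci is a hypothetical name; it converts the desired confidence level to the z critical value via the normal quantile):

```julia
using Distributions

# Hypothetical helper: large-sample z-based CI for a population mean,
# from the sample mean x̄, sample standard deviation s, and sample size n
function mean_ci(x̄, s, n; level = 0.95)
    z_c = quantile(Normal(), 1 - (1 - level) / 2)   # z critical value
    margin = z_c * s / sqrt(n)
    (x̄ - margin, x̄ + margin)
end
```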
Suppose you have a table such as this (I hope it renders well; I don't use this so often):
% Conf. Lev.   .9973   .99    .98    .6827
z_c             3.00   2.58   2.33   1.00
etc.
If you'd input 3.00 for z_c, what is the formula that yields .9973? If you input .99, what is the formula that yields 2.58?
In this case, if I make those functions myself (and perhaps share them with others if there's interest), I am particularly interested in these two formulas.
I think you want the quantile function from Distributions.jl:
julia> z = Normal()
Normal{Float64}(μ=0.0, σ=1.0)
julia> -quantile.(z, 0.5 .* (1 .- [.9973, .99, .98, .6827]))
4-element Array{Float64,1}:
2.999976992703395
2.5758293035489053
2.326347874040846
1.000021713322999
(Note that to get the quantile from the two-sided p value, which you seem to be doing here, you have to adjust the p value to be half the distance from 1, hence the 0.5 .* (1 .- p) bit.)
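Going the other way, from a z critical value back to the confidence level, is just the normal CDF: P(-z_c < Z < z_c) = 2Φ(z_c) - 1. A sketch:

```julia
using Distributions

# Confidence level from a z critical value: P(-z_c < Z < z_c) = 2Φ(z_c) - 1
conf_level(z_c) = 2 * cdf(Normal(), z_c) - 1

conf_level(3.00)   # ≈ 0.9973
conf_level(2.58)   # ≈ 0.99
```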
Thanks for the book recommendation, I just started reading it ("The likelihood principle"). I wish I had read it a few years earlier.