Conformal Predictive Distributions

It’s always possible to intentionally make Bayes arbitrarily bad. Take the simple case where you want to predict the output of a normal random number generator with unknown mean \mu and known standard deviation \sigma=1.

Suppose that \mu is some number around 100. You could choose a reasonable prior for \mu that includes 100 in its support, for example \mu ~ Gamma(20.0, 80.0/19).


Or, if you are ornery or adversarial, you could make your prior a Dirac delta at 1.37\times 10^{231}, i.e. \delta(\mu - 1.37\times 10^{231}). This basically guarantees that none of the intervals you ever generate will contain any of the actual data, and no amount of data collection will help.
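To make the point-mass case concrete, here is a minimal sketch. It assumes the true mean is exactly 100 and uses a 95% central predictive interval; the function name and the simulation settings are made up for illustration.

```python
import random
from statistics import NormalDist

def predictive_interval_point_prior(m0, sigma=1.0, level=0.95):
    """Central predictive interval when the prior on mu is a point mass at m0.

    A point-mass prior is unchanged by any data, so the posterior predictive
    is simply Normal(m0, sigma), whatever was observed.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m0 - z * sigma, m0 + z * sigma

rng = random.Random(0)
true_mu = 100.0  # assumed "true" mean, for illustration only
data = [rng.gauss(true_mu, 1.0) for _ in range(10_000)]

# In double precision m0 +/- 1.96 rounds back to m0, so the interval
# collapses to a single point. Either way, no draw near 100 lands in it.
lo, hi = predictive_interval_point_prior(1.37e231)
covered = sum(lo <= y <= hi for y in data)
print(covered)
```

Collecting more data changes nothing here, because the interval never depends on `data` at all.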

So the moral of the story is that Bayes’s job is to tell you the consequences of your assumptions, which it does. It does not do anything to prevent you from making insane assumptions.

With reasonable assumptions and a lot of data, Bayesian models of random number generators will generally converge to the true parameter values and have asymptotically correct frequency coverage.
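As a rough illustration of that convergence, the sketch below approximates the posterior for \mu on a grid under the Gamma prior mentioned earlier. It assumes Gamma(20.0, 80.0/19) means shape 20 and scale 80/19 (the thread does not say which parameterization is meant), a true mean of 100, and a grid restricted to [50, 150]; all names are hypothetical.

```python
import math
import random

def grid_posterior(data, shape=20.0, scale=80.0 / 19.0,
                   lo=50.0, hi=150.0, step=0.01):
    """Posterior mean and sd of mu on a grid, with Normal(mu, 1) likelihood.

    Assumes a Gamma(shape, scale) prior on mu (shape/scale is a guess at the
    intended parameterization) and truncates mu to [lo, hi].
    """
    n = len(data)
    ybar = sum(data) / n
    grid = [lo + i * step for i in range(int(round((hi - lo) / step)) + 1)]
    # Log prior plus log likelihood, dropping terms that do not depend on mu.
    logp = [(shape - 1) * math.log(m) - m / scale - 0.5 * n * (m - ybar) ** 2
            for m in grid]
    mx = max(logp)
    w = [math.exp(l - mx) for l in logp]  # subtract the max for stability
    total = sum(w)
    mean = sum(m * wi for m, wi in zip(grid, w)) / total
    var = sum((m - mean) ** 2 * wi for m, wi in zip(grid, w)) / total
    return mean, math.sqrt(var)

rng = random.Random(0)
true_mu = 100.0  # assumed "true" mean, for illustration only
results = {}
for n in (10, 100, 1000):
    data = [rng.gauss(true_mu, 1.0) for _ in range(n)]
    results[n] = grid_posterior(data)
    print(n, results[n])
```

The posterior sd should shrink roughly like 1/\sqrt{n} while the posterior mean settles near 100, since this prior is nearly flat on the scale of the likelihood.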

Our conversation digressed from my initial questions, so I’ll try to rephrase them more precisely.

Using training data D_{n} \equiv \{ (X_i, Y_i)\}_{i=1}^{n} , fit a model Y_{i} = f(X_{i}) .
Next, given a new X_{n+1}, estimate a prediction interval C_{1-\alpha} = \left[ L_{1-\alpha}, U_{1-\alpha}\right]
with P( Y_{n+1} \in C_{1-\alpha}) \in \left[ 1-\alpha, 1-\alpha +\frac{1}{n+1} \right] for \alpha \in [0,1] .
Let F(\cdot) be the CDF of Y_{n+1}|X_{n+1}.
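For concreteness, here is a sketch of one way to get an interval with that kind of guarantee: split conformal prediction with absolute-residual scores. This is an assumed choice, since nothing above fixes a particular conformal method. In this variant the n in the coverage bound is the size of the calibration set, and the upper bound needs continuous (tie-free) scores. The linear model, the simulated data, and all function names are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def split_conformal_halfwidth(cal_x, cal_y, predict, alpha):
    """k-th smallest absolute residual, with k = ceil((n + 1) * (1 - alpha))."""
    scores = sorted(abs(y - predict(x)) for x, y in zip(cal_x, cal_y))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]

rng = random.Random(1)

def draw(n):
    # Hypothetical data-generating process: y = 1 + 2x + N(0, 1) noise.
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

train_x, train_y = draw(500)
cal_x, cal_y = draw(1000)
test_x, test_y = draw(20000)

a, b = fit_line(train_x, train_y)

def predict(x):
    return a + b * x

q = split_conformal_halfwidth(cal_x, cal_y, predict, alpha=0.1)

# The interval for a new x is [predict(x) - q, predict(x) + q].
coverage = sum(abs(y - predict(x)) <= q
               for x, y in zip(test_x, test_y)) / len(test_y)
print(round(q, 3), round(coverage, 3))
```

Note that the guarantee is marginal: it averages over the calibration draw as well as the test point, so the coverage measured for any single calibration set will wobble around 1-\alpha.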

  1. Under what conditions do we have:
    L_{1-2q} =F^{-1}(q) for quantile level q \in [0,0.5]
    U_{2q-1} =F^{-1}(q) for quantile level q \in [0.5,1]
  2. What do we know about the distribution of the statistic L_{1-2q}?

It’s not hard to imagine valid prediction intervals (in the frequentist sense above) that don’t have this property.
For example, consider a prediction interval C_{90\%}=[L_{90\%},U_{90\%}], where P( Y_{n+1} \leq L_{90\%}) =2\% and P( Y_{n+1} \leq U_{90\%}) = 92\%.
In this case, L_{90\%} =F^{-1}(2\%) \neq F^{-1}(5\%).
However, we still have P(Y_{n+1}\in C_{90\%}) = 90\% (the prediction interval is still valid in the frequentist sense).
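A quick numerical check of this example, assuming purely for illustration that F is the standard normal CDF:

```python
from statistics import NormalDist

F = NormalDist()  # assumed form of F; any continuous CDF would do

# Asymmetric interval from the example: 2% in the lower tail, 8% in the upper.
L_asym, U_asym = F.inv_cdf(0.02), F.inv_cdf(0.92)
# Equal-tailed interval: 5% in each tail.
L_eq, U_eq = F.inv_cdf(0.05), F.inv_cdf(0.95)

cov_asym = F.cdf(U_asym) - F.cdf(L_asym)
cov_eq = F.cdf(U_eq) - F.cdf(L_eq)

print(L_asym, U_asym, cov_asym)
print(L_eq, U_eq, cov_eq)
```

Both intervals cover 90% of the mass, but their endpoints differ, so coverage alone does not pin the endpoints to particular quantiles of F.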