The cognitive bias of low-probability events

Just a thought after reading the news about Madonna endorsing hydroxychloroquine.

The problem arises when one has to judge the difference between distributions that differ only in the tail, such as Bernoulli distributions with p = 0.95 and p = 0.99.

People, individually, don’t have the experience for that. Only “big trials” can distinguish between a treatment for coronavirus after which 95% recover and a placebo after which 90% recover.
The same goes for homeopathy: flu and colds are largely benign, so it is really hard to judge from personal experience how much a “real” medicine helps.
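
To put a rough number on “big trials”, here is a minimal sketch of how many patients each arm would need to separate 95% from 90% recovery. It assumes a standard two-sided two-proportion z-test at the 5% level with 80% power. These design choices, the hard-coded normal quantiles, and the function name `n_per_arm` are assumptions for illustration, not something taken from an actual trial.

```julia
# Approximate sample size per arm for comparing two proportions with a
# two-sided z-test (normal approximation). The quantiles are hard-coded
# assumptions: z_alpha = 1.96 (5% two-sided level), z_beta = 0.8416 (80% power).
function n_per_arm(p1, p2; z_alpha = 1.96, z_beta = 0.8416)
    pbar = (p1 + p2) / 2                               # pooled proportion under the null
    a = z_alpha * sqrt(2 * pbar * (1 - pbar))          # spread under the null
    b = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # spread under the alternative
    return ceil(Int, (a + b)^2 / (p1 - p2)^2)
end

n_treatment_vs_placebo = n_per_arm(0.95, 0.90)
println("patients needed per arm: ", n_treatment_vs_placebo)
```

Under these assumptions the answer is several hundred patients per arm, which is far more cases than any one person observes.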

In all these cases, fake treatments that exploit this cognitive bias can emerge.


Just like the (in)famous machine learning example: an ML model that identifies tumors has an accuracy of 85%, but 90% of the samples contain a tumor, so a simpler model that just returns true has an accuracy of 90%.
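
A minimal sketch of that baseline, on a made-up data set of 1000 samples with a 90% tumor rate (the data and the helper `accuracy` are hypothetical, for illustration only):

```julia
# Hypothetical data set: 1000 samples, 90% of which contain a tumor.
labels = vcat(fill(true, 900), fill(false, 100))

# Trivial "model" that ignores its input and always predicts a tumor.
predictions = fill(true, length(labels))

# Accuracy is the fraction of predictions that match the labels.
accuracy(pred, truth) = count(pred .== truth) / length(truth)

baseline_accuracy = accuracy(predictions, labels)
println("baseline accuracy: ", baseline_accuracy)
```

Any real classifier has to beat this number before its accuracy means anything.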


The most severe problem is the decreasing trust in science, which is fueled by well-known leaders these days.


While I agree that in general you should not take medical advice from pop stars (doh), it is interesting to just plot the numbers. Specifically, let’s take a model of n Bernoulli(\alpha) draws with a flat prior, which ends up as a Beta posterior.

The graph below shows, for n = 300, the log pdf of the success rate under a “true” \alpha of 0.95 (red) and 0.99 (blue) as dashed lines, while the solid lines are the log posteriors. With only a few hundred draws, you can discriminate between these two pretty well under general circumstances. Of course, in real life you have covariates, confounding variables, etc.
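
The same point can be checked without a plot. This is a rough sketch, not part of the plotting code: it computes the log-likelihood ratio of \alpha = 0.95 against \alpha = 0.99 for two hypothetical outcomes of n = 300 draws. The helper names `loglik` and `llr` and the two outcome counts are assumptions made for illustration.

```julia
# Log-likelihood of k successes in n Bernoulli(α) draws, dropping the
# binomial coefficient because it cancels in the ratio below.
loglik(α, k, n) = k * log(α) + (n - k) * log(1 - α)

# Log-likelihood ratio of α1 against α2; positive values favor α1.
llr(k, n; α1 = 0.95, α2 = 0.99) = loglik(α1, k, n) - loglik(α2, k, n)

n = 300
println(llr(285, n))  # 285 successes, i.e. an observed rate of 95%
println(llr(297, n))  # 297 successes, i.e. an observed rate of 99%
```

The first value is about 12.4 and the second about -7.4 (natural log units), so in either case the data favor the matching \alpha by a factor in the thousands or more.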

code for plot
using PGFPlotsX, Distributions

# Log posterior (solid), true α (dotted) and binomial log pmf (dashed) on [0.9, 1].
function plot_beta(color, α, n)
    x = range(0.9, 1; length = 200)
    # Note: Beta(α n, (1 - α) n) is used here; a strictly flat Beta(1, 1) prior
    # would add 1 to each parameter.
    @pgf (Plot({ color = color, no_marks },
               Table(x, logpdf.(Ref(Beta(α * n, (1 - α) * n)), x))),
          VLine({ color = color, dotted }, α),
          Plot({ color = color, dashed },
               Table(x, logpdf.(Ref(Binomial(n, α)), round.(Int, n .* x)), x)))
end

# Overlay both cases for n = 300 draws.
@pgf Axis({ xlabel = "α", ylabel = "log posterior", ymajorgrids, ymin = -20 },
          plot_beta("red", 0.95, 300)...,
          plot_beta("blue", 0.99, 300)...)

…I think 300 draws of any medical condition are beyond the experience of most individuals :slight_smile:


Sure, but inference about most medical conditions based on one’s personal experience is pretty nonsensical as the sample size is very small (except for recurring conditions like seasonal colds etc). I don’t think that’s the relevant baseline here.

Even if they have no formal training in statistics, I guess most people who are otherwise functionally literate would do OK with simple tabulations when sample sizes are comparable to a typical RCT, even if the probabilities are near 0.99.

IMO the problematic range is near 10^{-3} or 10^{-4}; for example this is the annual probability for a lot of eminently preventable household accidents involving children or adults. But these are low enough so that a typical person may not personally know of a case in their own social network.
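
As a back-of-the-envelope check, here is the probability that nobody in a person's network experiences such an event in a given year, assuming independent cases and a hypothetical network of 150 people (both the independence and the network size are assumptions, as is the helper name `p_no_case`):

```julia
# Probability that none of N independent people experiences an event
# with annual probability p. N = 150 is a hypothetical network size.
p_no_case(p, N) = (1 - p)^N

for p in (1e-3, 1e-4)
    println("p = ", p, ": P(no case among 150 people) = ", p_no_case(p, 150))
end
```

Under these assumptions the chance of seeing no case at all in a year is roughly 86% for 10^{-3} and over 98% for 10^{-4}.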

Of course as @oheil pointed out, this whole thing may have little to do with the pseudoscience surrounding COVID-19. My point was simply that I disagree with your claim that