[ANN-RFC] KissABC.jl - Approximate Bayesian Computation

Hi everyone, I made a very self-contained package to efficiently perform Approximate Bayesian Computation. The package implements a hybrid MCMC sampler with 3 different affine-invariant proposals, solving many of the shortcomings of other MCMC algorithms (adaptive and non-adaptive).

Compared to the alternatives in the Julia ecosystem, this package offers:

  • Ability to sample from priors with mixed support (discrete and continuous); see the sketch after these lists
  • Ability to easily implement support for custom types of ABC likelihoods
  • Support for both kernelized ABC and threshold ABC
  • A very efficient MCMC algorithm with proposals that are insensitive to the scale and orientation of the target distribution (affine invariant sampling)
  • Super simple, unit-tested code that can be easily extended with more algorithms and features

It does not offer (and will never offer):

  • Surrogate models to speed up ABC
  • Automatic distance functions between datasets
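
As a quick illustration of the mixed-support point above, a minimal sketch; the toy prior is my own, it just combines a discrete and a continuous distribution with the Factored type used in the examples later in this thread:

using KissABC, Distributions

# a prior with one discrete and one continuous component; it can be passed
# as the prior argument of the samplers shown further down in this thread
prior = Factored(
    DiscreteUniform(1, 20), # e.g. an integer-valued parameter
    Uniform(0.0, 2.0),      # a continuous parameter
)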

This package is not in the Julia General registry at the moment; if there is some interest I will register it :slight_smile:

Update: the package is now registered, add it via

import Pkg
Pkg.add("KissABC")

The repository can be found at:

Would be cool to interface this with Turing if possible!

I would love for this to happen: Turing + ABC

I just gave it a try with a toy exponential mixture model. It works well and produces similar results to ApproxBayes.jl’s ABCSMC, but seems to be a bit less sample-efficient.

When using the same ε, KissABC.jl uses about 4x more samples but produces a posterior with tighter 95% intervals (I’m guessing this is related to ApproxBayes.jl stopping early because of the default tolerance). When I increase ε for KissABC.jl to get a similar posterior, it still requires 2.5x more samples than ApproxBayes.jl.

Would you expect better results? Is it just a case of not working as well on this specific model?

using KissABC, Distributions, Random, Statistics

# two-component exponential mixture: scale u1 with probability p1, u2 otherwise;
# u2 is chosen so that the mixture mean equals 1
function sim((u1, p1), params; n=10^6, raw=false)
    u2 = (1.0 - u1*p1)/(1.0 - p1)
    x = randexp(n) .* ifelse.(rand(n) .< p1, u1, u2)
    raw && return x
    [std(x), median(x)]              # summary statistics
end

# scaled Euclidean distance between summary-statistic vectors
function dist(s, s0)
    sqrt(sum(((s .- s0) ./ s) .^ 2))
end

ABCSMCPR(Factored(Uniform(0,1), Uniform(0.5,1)), sim, [2.2, 0.4], dist, 0.01,
         nparticles=100, parallel=true)

@robsmith11
You can tune the parameter c to obtain comparable sampling costs. I have had direct experience with the sampler from Toni et al. 2009 implemented in ApproxBayes.jl, and it suffers from degeneracy of the weights, so the lower CPU cost can turn out to be due to incorrect sampling.

In any case, in KissABC.jl the sampling cost is tunable: a higher value of c (which defaults to 0.01) lets the algorithm spend less time on individual samples. The quality of the posterior will not suffer much, but there will be some sample duplication.
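
For instance, reusing sim and dist from the snippet above, something like the following; note that I'm inferring from this post that c is exposed as a keyword argument of ABCSMCPR, so treat that keyword name as an assumption:

# keyword name `c` assumed from the post above; raising it above the default 0.01
# spends less time per sample at the cost of some duplicated samples
ABCSMCPR(Factored(Uniform(0,1), Uniform(0.5,1)), sim, [2.2, 0.4], dist, 0.01,
         nparticles=100, c=0.1, parallel=true)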

Ok, thanks. I really don’t know much about the details of the SMC algorithms. ApproxBayes.jl usually produces good results for my models, so I’ve been using it, though I did have some problems with the posteriors it produced for some more complex models, so maybe that’s related to what you mentioned. I’ll experiment more later. It’s great to have multiple implementations to compare!

Thanks, cool package! Would it be possible to add support for parallelizing the particle computation over worker processes (à la Distributed.jl) rather than just threads? It seems like your code uses @threads internally.

It can be done trivially, I think :slight_smile: Just run multiple SMC processes, one for each worker, each with fewer particles; at the end, collect all the results and join them.
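
A minimal sketch of that idea, not part of the package: it uses the stdlib Distributed, reuses the toy model from the snippet above, and assumes ABCSMCPR returns a particle collection that can simply be concatenated:

using Distributed
addprocs(4)                              # one worker per core, for example
@everywhere using KissABC, Distributions, Random, Statistics
@everywhere begin
    # same toy exponential mixture as above (raw option omitted)
    function sim((u1, p1), params; n=10^6)
        u2 = (1.0 - u1*p1)/(1.0 - p1)
        x = randexp(n) .* ifelse.(rand(n) .< p1, u1, u2)
        [std(x), median(x)]
    end
    dist(s, s0) = sqrt(sum(((s .- s0) ./ s) .^ 2))
end

# one independent SMC run per worker, each with fewer particles, then pool the results
chunks = pmap(1:nworkers()) do _
    ABCSMCPR(Factored(Uniform(0,1), Uniform(0.5,1)), sim, [2.2, 0.4], dist, 0.01,
             nparticles=25)
end
particles = reduce(vcat, chunks)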

I just implemented a WIP version of ABC SMC - Differential Evolution; it is scary fast, give it a spin :wink:

ABCDE is the method name

Nice! It is indeed very fast for me. With the same settings and the mixture model, it’s 10x faster than ABCSMCPR and 3x faster than ApproxBayes. If it works as well for more complex models, it will be quite a nice speed up for some of my work.

Did you use the algorithm from Turner et al. 2012 for ABCDE? Are there any downsides relative to SMC to be careful of?

Instead of a square accept/reject kernel (i.e. distance statistic within epsilon), can it handle smooth kernels?

Not that I know of :slight_smile:

At the moment no, as smooth kernels tend to complicate the algorithms; I still have to try them.

I started a similar effort some time ago here:

I didn’t include the Drovandi et al. (2011) method, because I thought for some reason that the similar method by Del Moral et al. (2011) was somewhat superior. Anyway, I usually find kernel-based ABC methods (also included in the package) to be much more sample efficient than MCMC-ABC methods. @francesco.alemanno would you be interested in joining forces to write a really nice and feature-rich ABC package? I like collaborative projects :smile:.

@jbrea that would be nice! A full-fledged, powerful ABC package; I would also like to collaborate :slight_smile:

I just implemented a WIP version of ABC SMC - Differential Evolution; it is scary fast, give it a spin :wink:
ABCDE is the method name

Wow, it is really fast. That’s awesome, thank you!

The package is now registered; have fun! And remember to let me know if you find issues or obvious limitations that might have gone unnoticed :smiley:

Thanks for the very interesting package :+1:. I have been playing around a bit with the mixture model in the documentation. I find that the confidence intervals seem to be too broad: in my limited experimentation, the true parameters fall inside the nominal 90% intervals more than 90% of the time when the experiment is repeated. For example, running the code attached below, which uses 10 repetitions, I get, in three runs:

mean(inci, dims = 1) = [1.0 1.0 1.0 0.8 1.0]
mean(inci, dims = 1) = [1.0 1.0 1.0 1.0 0.9]
mean(inci, dims = 1) = [1.0 1.0 1.0 0.9 1.0]


Each time this is run, we compute 10 CIs for each of 5 parameters. So, in the three repetitions, 150 CIs were computed. There were 4 rejections of a true parameter, when we would expect 15. I was wondering if there’s a good way to tune this to make the CI coverage more accurate. The code I ran follows; it uses the code for the mixture model, but with a small addition to keep track of the confidence-interval results. Thanks again!

#using Pkg
#Pkg.activate("./")
using KissABC
using Distributions
using Plots

function model(P,N)
    μ_1, μ_2, σ_1, σ_2, prob=P
    d1=randn(N).*σ_1 .+ μ_1
    d2=randn(N).*σ_2 .+ μ_2
    ps=rand(N).<prob
    R=zeros(N)
    R[ps].=d1[ps]
    R[.!ps].=d2[.!ps]
    R
end

function D(x,y)
    r=0:0.01:1
    sum(abs2,quantile.(Ref(x),r).-quantile.(Ref(y),r))/length(r)
end

function getstats(P,V)
    (
        param=P,
        median=median(V),
        lowerbound=quantile(V,0.05),
        upperbound=quantile(V,0.95)
    )
end

function main(reps)
    inci = zeros(reps, 5) # holds whether each true parameter fell inside its 90% CI
    for r = 1:reps
        println("repetition ", r)
        parameters = (1.0, 0.0, 0.2, 2.0, 0.4)
        data = model(parameters, 5000)
        #histogram(data)
        prior = Factored(
            Uniform(0, 3),  # there is surely a peak between 0 and 3
            Uniform(-2, 2), # there is a smeared distribution centered around 0
            Uniform(0, 1),  # the peak surely has a width below 1
            Uniform(0, 4),  # the smeared distribution surely has a width less than 4
            Beta(4, 4),     # the total number of events from the two distributions looks about the same, so we favor 0.5 just a bit
        )
        plan = ABCplan(prior, model, data, D, params=5000)
        res, Δ = ABCDE(plan, 0.02, parallel=true, verbose=false)
        stats = getstats.((:μ_1, :μ_2, :σ_1, :σ_2, :prob), [getindex.(res, i) for i in 1:5])

        for is in eachindex(stats)
            #println(parameters[is], " → ", stats[is])
            inci[r, is] = stats[is].lowerbound < parameters[is] && stats[is].upperbound > parameters[is]
        end
    end

    println("90% CI coverage")
    @show mean(inci, dims=1)
    return nothing
end

Thank you for your feedback! :slight_smile: Glad you find it interesting.
ABC methods most of the time suffer from wider CIs than exact methods, since both the imperfect distance metric and the non-zero tolerance tend to inflate them. I think the example can be improved by considering a Kolmogorov-Smirnov distance and a lower tolerance.
By the way, the example has been improved a tiny bit in the most recent docs; the most notable difference is the use of abs instead of abs2.
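
In case it helps, here is a minimal sketch of a two-sample Kolmogorov-Smirnov distance that could replace D in the example above; this is only my reading of the suggestion (using ecdf from StatsBase), not necessarily what the updated docs do:

using StatsBase: ecdf

# largest vertical gap between the empirical CDFs of the two samples;
# the supremum is attained at one of the data points, so checking those suffices
function D_ks(x, y)
    Fx, Fy = ecdf(x), ecdf(y)
    maximum(abs(Fx(t) - Fy(t)) for t in vcat(x, y))
end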

Bayesian methods are not confidence procedures. Bayesian probability does not measure frequency of occurrence; it measures plausibility under assumptions.