Help with optimizing a particle filter in SciML

Hi everyone,

I’m trying to put together a demonstration of a very simple particle filter for my repository https://github.com/epirecipes/sir-julia. This involves:

  1. Setting up an array of simulations.
  2. Simulating for a single timestep.
  3. Resampling the simulations that match the observed data
  4. Compute a partial likelihood as the proportion of simulations that are consistent with the data (as likelihoods are either 0 or 1).
  5. If there are more timepoints, go back to (2).
  6. Compute the

I need something that works for both in-place and out-of-place versions. My current code is below. A couple of questions:

  1. How can I reduce allocations/increase speed? Using set_u! instead of reinit! makes the code faster, but then I have to set save_everystep to false, and then the in-place version doesn’t work.
  2. Why are the results different for the in-place and out-of-place versions?
julia> @time pfilter(prob_op, p, u0, X, 100000, 1234)
  8.030784 seconds (46.90 M allocations: 3.463 GiB, 9.35% gc time)
-113.48234265521722

julia> @time pfilter(prob_ip, p, u0, X, 100000, 1234)
  8.026255 seconds (39.00 M allocations: 3.210 GiB, 8.44% gc time)
-113.42214586062131
# Load packages
using OrdinaryDiffEq
using Random
using Distributions
using StatsBase

# Set seed
Random.seed!(1)

# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
    (S,I,_) = u
    C=0
    (β,γ)=p
    p
    δt=0.1
    nsteps=10
    for i in 1:nsteps
        ifrac = 1-exp(-β*I*δt)
        rfrac = 1-exp(-γ*δt)
        infection=rand(Binomial(S,ifrac))
        recovery=rand(Binomial(I,rfrac))
        S = S-infection
        I = I+infection-recovery
        C = C+infection
    end
   [S,I,C]
end

# In place version of the above
function sir_markov!(du,u,p,t)
    (S,I,C) = u
    C=0
    (β,γ)=p
    δt=0.1
    nsteps=10
    for i in 1:nsteps
        ifrac = 1-exp(-β*I*δt)
        rfrac = 1-exp(-γ*δt)
        infection=rand(Binomial(S,ifrac))
        recovery=rand(Binomial(I,rfrac))
        S = S-infection
        I = I+infection-recovery
        C = C+infection
    end
   @inbounds begin
    du[1] = S
    du[2] = I
    du[3] = C
   end
end

## Time, initial conditions, and parameter values
tspan = (0,40)
u0 = [990, 10, 0] # S, I, C
p = [0.0005, 0.25] # β, γ

## Solve for a single run to generate simulated data
prob_ip = DiscreteProblem(sir_markov!, u0, tspan, p, dt=1)
prob_op = DiscreteProblem(sir_markov, u0, tspan, p, dt=1)
sol_op = solve(prob_op, FunctionMap())
X = hcat(sol_op.u...)[3,2:end]

# Simple particle filter
function pfilter(prob, p, u0, X, trajectories=1000, seed=1234)
    prob = remake(prob, p=p, u0=u0)
    integrators = [init(prob, FunctionMap(),save_everystep=true) for i in 1:trajectories]
    Random.seed!(seed)
    liks = zeros(Float64,length(X))
    weights = Weights(zeros(Float64,trajectories))
    us = [copy(u0) for i in 1:trajectories]
    idx = collect(1:trajectories)
    @inbounds for t in 1:length(X)
        step!.(integrators)
        x = X[t]
        [us[i] = integrators[i].u for i in 1:trajectories]
        [weights[i]=Float64(us[i][3]==x) for i in 1:trajectories]
        liks[t] = mean(weights)
        if mean(weights)==0.0
            return -Inf
            break
        end
        sample!(1:trajectories, weights, idx)
        [reinit!(integrators[i],us[idx[i]]) for i in 1:trajectories]
    end
    sum(log.(liks))
end

pfilter(prob_op, p, u0, X, 100000, 1234)
@time pfilter(prob_op, p, u0, X, 100000, 1234)
@time pfilter(prob_ip, p, u0, X, 100000, 1234)

Not sure about the whole system, but your in-place version should only return the change in du not the new u by summation, documentation here: Discrete Problems · DifferentialEquations.jl

Why are you iterating inside sir_markov? Usually, that should only advance one time step and produce the change to be accumulated by the solver.

Hi @contradict.

That certainly isn’t the way that I read the documentation - I thought that du was the new state for a DiscreteProblem with a FunctionMap solver.

Iterating inside the solver was the simplest way to get solutions where one variable (C) is reset every dt=1. In this way, one time step in the solver turns into a single step in the particle filter. I originally tried running the model with multiple iterations for the step! function, as well as trying callbacks, but the code rapidly got clunky for this simple example, and I was finding it difficult to debug.

Huh, OK, I think I don’t understand the problem you are trying to solve well enough to help, sorry for the noise!.

No worries! In case it helps, here’s a version without SciML, which is much less resource-intensive. If I use the same data as above, I get the following (a 2x speedup and a 10x reduction in allocations).

 julia> @time pfilter(p, u0, X, 100000, 1234)
  4.759976 seconds (4.20 M allocations: 505.841 MiB, 1.09% gc time)
-113.48234265521722
# Load packages
using Random
using Distributions
using StatsBase
using Plots

# Set seed
Random.seed!(1)

# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
    (S,I,_) = u
    C=0
    (β,γ)=p
    p
    δt=0.1
    nsteps=10
    for i in 1:nsteps
        ifrac = 1-exp(-β*I*δt)
        rfrac = 1-exp(-γ*δt)
        infection=rand(Binomial(S,ifrac))
        recovery=rand(Binomial(I,rfrac))
        S = S-infection
        I = I+infection-recovery
        C = C+infection
    end
   [S,I,C]
end

## Time, initial conditions, and parameter values
tspan = (0,40)
u0 = [990, 10, 0] # S, I, C
p = [0.0005, 0.25] # β, γ

function simulate(p,u0,nsteps,dt)
    u = copy(u0)
    du = copy(u0)
    t = 0.0
    ta = []
    ua = []
    push!(ta,t)
    push!(ua,copy(u))
    for t in 1:nsteps
        du = sir_markov(u,p,t)
        u,du = du,u
        t = t + dt
        push!(ta,t)
        push!(ua,copy(u))
    end
    (ta,ua)
end

sol_ip = simulate(p, u0, 40, 1.0)
X = hcat(sol_ip[2]...)[3,2:end]

# Simple particle filter
function pfilter(p, u0, X, trajectories=1000, seed=1234)
    Random.seed!(seed)
    liks = zeros(Float64,length(X))
    weights = Weights(zeros(Float64,trajectories))
    us = [copy(u0) for i in 1:trajectories]
    dus = [copy(u0) for i in 1:trajectories]
    @inbounds for t in 1:length(X)
        [dus[i] = sir_markov(us[i],p,t) for i in 1:trajectories]
        x = X[t]
        [weights[i]=Float64(dus[i][3]==x) for i in 1:trajectories]
        liks[t] = mean(weights)
        if mean(weights)==0.0
            return -Inf
            break
        end
        sample!(dus, weights, us)
    end
    sum(log.(liks))
end

pfilter(p, u0, X, 100000, 1234)
@time pfilter(p, u0, X, 100000, 1234)

Most of the time is taken in the sir_markov function. Specifically, in calculation of random Binomial samples.

One method to speed it up is to use a Poisson approximation, i.e. replace Binomial(S, ifrac) with Poisson(S*ifrac). The new function could look like:

# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
    (S,I,_) = u
    C=0
    (β,γ)=p
    p
    δt=0.1
    nsteps=10
    for i in 1:nsteps
        ifrac = 1-exp(-β*I*δt)
        rfrac = 1-exp(-γ*δt)
        infection=min(S,rand(Poisson(ifrac*S)))
        recovery=min(I,rand(Poisson(rfrac*I)))
        S = S-infection
        I = I+infection-recovery
        C = C+infection
    end
    [S,I,C]
end

In my test it gave about 40% speedup.

Thanks @Dan

I was trying to avoid changing the model specification, to see what other gains could be made - if you swap out the standard rand(Poisson(rate)) for pois_rand in Home · PoissonRandom.jl then you get a further bump in performance.

My non-SciML version:

  4.780279 seconds (4.20 M allocations: 505.841 MiB, 0.75% gc time)
-113.52223905672342

Using Random.jl and Distributions.jl:

  3.043553 seconds (4.20 M allocations: 505.841 MiB, 1.27% gc time)
-113.38314569733771

Using PoissonRandom.jl:

  2.638001 seconds (4.20 M allocations: 505.841 MiB, 1.26% gc time)
-113.32727917893702

It’s nice to know that some further approximations and slight changes in implementation can give a near 2x speedup, but I would like to use the SciML infrastructure for non-toy problems, so trying to reduce the overhead would be good. I’m also trying to understand why the in-place and out-of-place versions give different results…

Measuring the overhead is also an important step. I used ProfileView on pfilter and >70% of time was on sir_markov, and in it, on the random Binomial sampling. So without optimizing this function it would be hard to have dramatic gains.

Did you profile on your machine? Are the results different?

There is an obvious trade-off between δt and nsteps. Perhaps a numerical approach would adaptively evaluate the approximation of using longer δt and reduce nsteps when possible.

Hi @Dan,

In the thread, the difference between using SciML and just calling a function is a difference between 8s and 4.8s with the same underlying model. The overhead used to be a big problem, which is why I wrote GitHub - sdwfrost/Gillespie.jl: Stochastic Gillespie-type simulations using Julia for doing stochastic simulations in continuous time (the gap has now closed between this and using JumpProblem), so I’m curious as to why it’s taking up so much time here.

As for the model itself, there are all sorts of tricks to speed it up - see Comparing simple simulations in Julia, Nim, C++ and R · GitHub for a thread that compares Julia with other languages for a similar model.