# Help with optimizing a particle filter in SciML

Hi everyone,

I’m trying to put together a demonstration of a very simple particle filter for my repository https://github.com/epirecipes/sir-julia. This involves:

1. Setting up an array of simulations.
2. Simulating for a single timestep.
3. Resampling the simulations that match the observed data
4. Compute a partial likelihood as the proportion of simulations that are consistent with the data (as likelihoods are either 0 or 1).
5. If there are more timepoints, go back to (2).
6. Compute the

I need something that works for both in-place and out-of-place versions. My current code is below. A couple of questions:

1. How can I reduce allocations/increase speed? Using `set_u!` instead of `reinit!` makes the code faster, but then I have to set `save_everystep` to `false`, and then the in-place version doesn’t work.
2. Why are the results different for the in-place and out-of-place versions?
``````julia> @time pfilter(prob_op, p, u0, X, 100000, 1234)
8.030784 seconds (46.90 M allocations: 3.463 GiB, 9.35% gc time)
-113.48234265521722

julia> @time pfilter(prob_ip, p, u0, X, 100000, 1234)
8.026255 seconds (39.00 M allocations: 3.210 GiB, 8.44% gc time)
-113.42214586062131
``````
``````# Load packages
using OrdinaryDiffEq
using Random
using Distributions
using StatsBase

# Set seed
Random.seed!(1)

# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
(S,I,_) = u
C=0
(β,γ)=p
p
δt=0.1
nsteps=10
for i in 1:nsteps
ifrac = 1-exp(-β*I*δt)
rfrac = 1-exp(-γ*δt)
infection=rand(Binomial(S,ifrac))
recovery=rand(Binomial(I,rfrac))
S = S-infection
I = I+infection-recovery
C = C+infection
end
[S,I,C]
end

# In place version of the above
function sir_markov!(du,u,p,t)
(S,I,C) = u
C=0
(β,γ)=p
δt=0.1
nsteps=10
for i in 1:nsteps
ifrac = 1-exp(-β*I*δt)
rfrac = 1-exp(-γ*δt)
infection=rand(Binomial(S,ifrac))
recovery=rand(Binomial(I,rfrac))
S = S-infection
I = I+infection-recovery
C = C+infection
end
@inbounds begin
du = S
du = I
du = C
end
end

## Time, initial conditions, and parameter values
tspan = (0,40)
u0 = [990, 10, 0] # S, I, C
p = [0.0005, 0.25] # β, γ

## Solve for a single run to generate simulated data
prob_ip = DiscreteProblem(sir_markov!, u0, tspan, p, dt=1)
prob_op = DiscreteProblem(sir_markov, u0, tspan, p, dt=1)
sol_op = solve(prob_op, FunctionMap())
X = hcat(sol_op.u...)[3,2:end]

# Simple particle filter
function pfilter(prob, p, u0, X, trajectories=1000, seed=1234)
prob = remake(prob, p=p, u0=u0)
integrators = [init(prob, FunctionMap(),save_everystep=true) for i in 1:trajectories]
Random.seed!(seed)
liks = zeros(Float64,length(X))
weights = Weights(zeros(Float64,trajectories))
us = [copy(u0) for i in 1:trajectories]
idx = collect(1:trajectories)
@inbounds for t in 1:length(X)
step!.(integrators)
x = X[t]
[us[i] = integrators[i].u for i in 1:trajectories]
[weights[i]=Float64(us[i]==x) for i in 1:trajectories]
liks[t] = mean(weights)
if mean(weights)==0.0
return -Inf
break
end
sample!(1:trajectories, weights, idx)
[reinit!(integrators[i],us[idx[i]]) for i in 1:trajectories]
end
sum(log.(liks))
end

pfilter(prob_op, p, u0, X, 100000, 1234)
@time pfilter(prob_op, p, u0, X, 100000, 1234)
@time pfilter(prob_ip, p, u0, X, 100000, 1234)
``````

Not sure about the whole system, but your in-place version should only return the change in `du` not the new `u` by summation, documentation here: Discrete Problems · DifferentialEquations.jl

Why are you iterating inside `sir_markov`? Usually, that should only advance one time step and produce the change to be accumulated by the solver.

That certainly isn’t the way that I read the documentation - I thought that `du` was the new state for a `DiscreteProblem` with a `FunctionMap` solver.

Iterating inside the solver was the simplest way to get solutions where one variable (C) is reset every dt=1. In this way, one time step in the solver turns into a single step in the particle filter. I originally tried running the model with multiple iterations for the step! function, as well as trying callbacks, but the code rapidly got clunky for this simple example, and I was finding it difficult to debug.

Huh, OK, I think I don’t understand the problem you are trying to solve well enough to help, sorry for the noise!.

No worries! In case it helps, here’s a version without SciML, which is much less resource-intensive. If I use the same data as above, I get the following (a 2x speedup and a 10x reduction in allocations).

`````` julia> @time pfilter(p, u0, X, 100000, 1234)
4.759976 seconds (4.20 M allocations: 505.841 MiB, 1.09% gc time)
-113.48234265521722
``````
``````# Load packages
using Random
using Distributions
using StatsBase
using Plots

# Set seed
Random.seed!(1)

# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
(S,I,_) = u
C=0
(β,γ)=p
p
δt=0.1
nsteps=10
for i in 1:nsteps
ifrac = 1-exp(-β*I*δt)
rfrac = 1-exp(-γ*δt)
infection=rand(Binomial(S,ifrac))
recovery=rand(Binomial(I,rfrac))
S = S-infection
I = I+infection-recovery
C = C+infection
end
[S,I,C]
end

## Time, initial conditions, and parameter values
tspan = (0,40)
u0 = [990, 10, 0] # S, I, C
p = [0.0005, 0.25] # β, γ

function simulate(p,u0,nsteps,dt)
u = copy(u0)
du = copy(u0)
t = 0.0
ta = []
ua = []
push!(ta,t)
push!(ua,copy(u))
for t in 1:nsteps
du = sir_markov(u,p,t)
u,du = du,u
t = t + dt
push!(ta,t)
push!(ua,copy(u))
end
(ta,ua)
end

sol_ip = simulate(p, u0, 40, 1.0)
X = hcat(sol_ip...)[3,2:end]

# Simple particle filter
function pfilter(p, u0, X, trajectories=1000, seed=1234)
Random.seed!(seed)
liks = zeros(Float64,length(X))
weights = Weights(zeros(Float64,trajectories))
us = [copy(u0) for i in 1:trajectories]
dus = [copy(u0) for i in 1:trajectories]
@inbounds for t in 1:length(X)
[dus[i] = sir_markov(us[i],p,t) for i in 1:trajectories]
x = X[t]
[weights[i]=Float64(dus[i]==x) for i in 1:trajectories]
liks[t] = mean(weights)
if mean(weights)==0.0
return -Inf
break
end
sample!(dus, weights, us)
end
sum(log.(liks))
end

pfilter(p, u0, X, 100000, 1234)
@time pfilter(p, u0, X, 100000, 1234)
``````

Most of the time is taken in the `sir_markov` function. Specifically, in calculation of random Binomial samples.

One method to speed it up is to use a Poisson approximation, i.e. replace `Binomial(S, ifrac)` with `Poisson(S*ifrac)`. The new function could look like:

``````# Define a model for a single time-step. This is an out-of-place version.
# C is the number of cases per timestep, which is reset for each iteration.
function sir_markov(u,p,t)
(S,I,_) = u
C=0
(β,γ)=p
p
δt=0.1
nsteps=10
for i in 1:nsteps
ifrac = 1-exp(-β*I*δt)
rfrac = 1-exp(-γ*δt)
infection=min(S,rand(Poisson(ifrac*S)))
recovery=min(I,rand(Poisson(rfrac*I)))
S = S-infection
I = I+infection-recovery
C = C+infection
end
[S,I,C]
end
``````

In my test it gave about 40% speedup.

Thanks @Dan

I was trying to avoid changing the model specification, to see what other gains could be made - if you swap out the standard `rand(Poisson(rate))` for `pois_rand` in Home · PoissonRandom.jl then you get a further bump in performance.

My non-SciML version:

``````  4.780279 seconds (4.20 M allocations: 505.841 MiB, 0.75% gc time)
-113.52223905672342
``````

Using `Random.jl` and `Distributions.jl`:

``````  3.043553 seconds (4.20 M allocations: 505.841 MiB, 1.27% gc time)
-113.38314569733771
``````

Using `PoissonRandom.jl`:

``````  2.638001 seconds (4.20 M allocations: 505.841 MiB, 1.26% gc time)
-113.32727917893702
``````

It’s nice to know that some further approximations and slight changes in implementation can give a near 2x speedup, but I would like to use the SciML infrastructure for non-toy problems, so trying to reduce the overhead would be good. I’m also trying to understand why the in-place and out-of-place versions give different results…

Measuring the overhead is also an important step. I used ProfileView on `pfilter` and >70% of time was on `sir_markov`, and in it, on the random Binomial sampling. So without optimizing this function it would be hard to have dramatic gains.

Did you profile on your machine? Are the results different?

There is an obvious trade-off between `δt` and `nsteps`. Perhaps a numerical approach would adaptively evaluate the approximation of using longer `δt` and reduce `nsteps` when possible.

Hi @Dan,

In the thread, the difference between using SciML and just calling a function is a difference between 8s and 4.8s with the same underlying model. The overhead used to be a big problem, which is why I wrote GitHub - sdwfrost/Gillespie.jl: Stochastic Gillespie-type simulations using Julia for doing stochastic simulations in continuous time (the gap has now closed between this and using `JumpProblem`), so I’m curious as to why it’s taking up so much time here.

As for the model itself, there are all sorts of tricks to speed it up - see Comparing simple simulations in Julia, Nim, C++ and R · GitHub for a thread that compares Julia with other languages for a similar model.