Stochastic Differential Equations -- basic question

I’m making my first attempt of Stochastic Differential Equations using a SIR model, and follow the stochastic formulation described in Keeling and Rohani (2008): Modeling Infectious Diseases in Humans and Animals, Princeton University Press.

My questions (at the bottom!) relates to the description of the noise + the choice of solver. First, the deterministic model (extensive variables, with immigration and emigration – sorry for my partially “private” notation):
image
When neglecting immigration and emigration and using data from Martcheva’s 2015 Springer book (An Introduction to Mathematical Epidemiology), the deterministic model gives:
image

Assuming Poisson distribution for the events leading to infection and recovery, Keeling and Rohani’s description gives the Stochastic Differential Equation:

Here, f() is the drift/“deterministic” term while G() is the diffusion/“stochastic” term, while dw holds two independent Wiener processes:

# Drift function for SIR model
function sir_drift!(f,x,p,t)
    S,I,R = x
    tau_i,tau_r,N = p
    #
    f[1] = -I*S/tau_i/N
    f[2] = I*S/tau_i/N - I/tau_r
    f[3] = I/tau_r
end
# Diffusion model for SIR model
function sir_diffuse!(G,x,p,t)
    S,I,R = x
    tau_i,tau_r,N = p
    #
    G[1,1] = -sqrt(I*S/tau_i/N)
    G[1,2] = 0.
    G[2,1] = sqrt(I*S/tau_i/N)
    G[2,2] = - sqrt(I/tau_r)
    G[3,1] = 0.
    G[3,2] = sqrt(I/tau_r)
end

Next, I define the problem; elements of the description:

#
prob = SDEProblem(sir_drift!,sir_diffuse!,x3,tspan,p,noise_rate_prototype=zeros(3,2))
ensembleprob = EnsembleProblem(prob)
sol = solve(ensembleprob,EnsembleThreads(),trajectories=100)

I can compute some statistics, and find:

image

Question 1: I have specified the noise as noise_rate_prototype=zeros(3,2). Is this correct for saying that there are two independent Wiener processes for the 3 states? Does it matter whether I use zeros(3,2) or ones(3,2) or something else, or is the important thing that the matrix has correct dimension? [And why not just say that I have a vector of Wiener processes with 2 elements??]

Question 2: I have not specified the solver. I tried to specify solver SRWI1, but that gave an error message.

1 Like

It just needs to have the correct type. The reason why we want the matrix, the prototype, is that this allows structures like sparse, Tridiagonal, etc. for specializing the computation.

SRIW1 cannot handle this one. LambaEM is probably the right call here, and probably what it uses.

1 Like

Thanks! If I understand you correctly, if I set noise_rate_prototype = A where A = sparse([1 0;1 1;0 1]) (in my case), the solver can take advantage of the structure of G(). [In my simple case, there is probably little to win by that.]

1 Like

One more question on manual computing of statistics… it is “trivial” to compute the mean, the median, and the std, e.g.:

sol_a = Array(sol)
sol_mean = mean(sol_a; dims=3)[:,:,1] |> permutedims
sol_median = median(sol_a; dims=3)[:,:,1] |> permutedims
sol_std = std(sol_a; dims=3)[:,:,1] |> permutedims
plot(sol_mean)
plot!(sol_median)
plot!(sol_mean,ribbon=(sol_std,sol_std))

However – for some reason, function quantile does not seem to allow for the dims keyword.

Question: Does this mean that quantile only accepts vectors as first argument, and that I need to loop through both the number of states (first dimension of sol, = 3 in my case) and the number of time points (second dimension of sol, = 111 in my case)? In other words, something like the following:

[quantile(sol_a[i,j,:],0.5) for j in 1:111, i in 1:3]

in my case of 3 states, 111 time points, and the median (0.5 quantile)? Or is there a built-in method that handles dims?

Ensembles and reliability in statistics… I seem to recall that Bradley Efron recommends the following in his book on Bootstrapping [or maybe I read it somewhere else…]:

  • For reliable computation of mean and variance, 50-100 trajectories are needed
  • For reliable computation of quantiles, etc., 1000+ trajectories are needed

Are these good rules of thumb in general for statistics based on an ensemble of realizations?

Try using Array(sol_a)?

Depends on the amount of nonlinearity and the dt. But as a rule of thumb, 10,000 trajectories is generally enough to saturate convergence for mean effects, and you need a lot higher for variance and quantiles to be very accurate from what I’ve seen.

With sol being the solution of the Ensemble problem:

It works exactly the same way if I replace mean with median, std, etc.

But for quantile (where I need an extra parameter), the following does not work (with, e.g., the 0.3 quantile):

I wrote my own function for doing what I want, but it is not perfect – it only works for 2 and 3 dim matrices (I think), and there is no proper testing of arguments.

function quantile_my(A,q=0.5;dims=1)
    sA = size(A)
    nA = length(sA)
    iA = setdiff(1:nA,dims)
    if nA == 2
        return permutedims([quantile(A[i,:],q) for i in 1:sA[iA[1]]])
    elseif nA == 3
        return permutedims([quantile(A[i,j,:],q) for j in 1:sA[iA[2]], i in 1:sA[iA[1]]])
    else
        println("Doesn't work for your array dimension")
    end
end

It would be nice if quantile were “symmetric” in the sense of offering the same input array as the other statistical functions. Maybe I’m just overlooking the documentation.

That seems like an issue in Julia’s Base then.