Speeding up DDE that interpolates past value at 3 indices

On the example for DDE, there is a section talking about speeding up interpolation at a specific index. However, I’m wondering if I can do similar things with more than one, like 3, indices. I have a large model with 150 variables, but I only need past values at 3 indices, so it’s a waste of time to interpolate all 150 at every timestep.

Additionally, DDE solver seems to have too much allocation and is very slow. I’m wondering if there’s anyway to just pre-allocate a cache to store the historic variables. I don’t see why it allocates more than what save_everystep=true does.

P.S.: I do need these 3 variables at 3 different times in the past (t-0.75, t-1.0, t-1.5), if this is relevant.

Show your code? Are you interpolating with idxs?

I tried to interpolate with idxs, but the confusing part is it generated even more allocations with idxs than without.
Here’re two MWEs (one implemented with idxs and the other with in-place history function):

#1
using DifferentialEquations;
using BenchmarkTools;

function computeNFkBFluxes!(nettFlux, concentration, delay, (birthday), time)
    @inbounds begin
        p = (birthday);
        hist_1_075 = delay(p, time-0.75; idxs=1);
        hist_2_075 = delay(p, time-0.75; idxs=2);
        hist_3_075 = delay(p, time-0.75; idxs=3);

        hist_2_1 = delay(p, time-1.0; idxs=2);
        hist_3_1 = delay(p, time-1.0; idxs=3);

        hist_1_15 = delay(p, time-1.5; idxs=1);
        hist_2_15 = delay(p, time-1.5; idxs=2);

        nettFlux[1] = (1/(1+1*(hist_2_075^2))) * (0.2 - 0.3)*concentration[1] - 0.5*concentration[1];
        nettFlux[2] = (1/(1+1*(hist_2_1^2))) * (1 - 0.2 + 0.3)*concentration[1] +
                (1/(1+1*(hist_1_15^2))) * (0.2 - 0.3)*concentration[2] - 1*concentration[2];
        nettFlux[3] = (1/(1+1*(hist_2_15^2))) * (1 - 0.2 + 0.3)*concentration[2] - 1*concentration[3];
        nettFlux[4] = (1/(1+1*(hist_1_075^2))) * (1 - 0.2 + 0.3)*concentration[3] - 1*concentration[4];
        nettFlux[5] = (1/(1+1*(hist_3_075^2))) * (1 - 0.2 + 0.3)*concentration[4] - 1*concentration[5];
        nettFlux[6] = (1/(1+1*(hist_3_1^2))) * (1 - 0.2 + 0.3)*concentration[5] - 1*concentration[6];
    end
end

const delay(p, t; idxs=nothing) = typeof(idxs) <: Number ? 0.0 : ones(6);
Bcell = DDEProblem(computeNFkBFluxes!, [1.0,1.0,1.0,1.0,1.0,1.0], delay, (0.0, 120.0), (0); constant_lags=[0.75, 1.0, 1.5]);

@btime solution = solve(Bcell, MethodOfSteps(Tsit5()), save_everystep=false, maxiters=1e10);

#2
function computeNFkBFluxes!(nettFlux, concentration, delay, (historicFlux), time)
    @inbounds begin
        # get historic values for nuclear NFkB concentrations (for delayed transcription)
        p = (historicFlux);
        delay(historicFlux, p, time-0.75);
        hist_1_075 = historicFlux[1];
        hist_2_075 = historicFlux[2];
        hist_3_075 = historicFlux[3];

        delay(historicFlux, p, time-1.0);
        hist_2_1 = historicFlux[2];
        hist_3_1 = historicFlux[3];

        delay(historicFlux, p, time-1.5);
        hist_1_15 = historicFlux[1];
        hist_2_15 = historicFlux[2];

        nettFlux[1] = (1/(1+1*(hist_2_075^2))) * (0.2 - 0.3)*concentration[1] - 0.5*concentration[1];
        nettFlux[2] = (1/(1+1*(hist_2_1^2))) * (1 - 0.2 + 0.3)*concentration[1] +
                (1/(1+1*(hist_1_15^2))) * (0.2 - 0.3)*concentration[2] - 1*concentration[2];
        nettFlux[3] = (1/(1+1*(hist_2_15^2))) * (1 - 0.2 + 0.3)*concentration[2] - 1*concentration[3];
        nettFlux[4] = (1/(1+1*(hist_1_075^2))) * (1 - 0.2 + 0.3)*concentration[3] - 1*concentration[4];
        nettFlux[5] = (1/(1+1*(hist_3_075^2))) * (1 - 0.2 + 0.3)*concentration[4] - 1*concentration[5];
        nettFlux[6] = (1/(1+1*(hist_3_1^2))) * (1 - 0.2 + 0.3)*concentration[5] - 1*concentration[6];
    end
end
historicFlux = ones(Float64, 6);
const delay(historicFlux, p, t) = (historicFlux .= 1.0);
Bcell = DDEProblem(computeNFkBFluxes!, [1.0,1.0,1.0,1.0,1.0,1.0], delay, (0.0, 120.0), (historicFlux); constant_lags=[0.75, 1.0, 1.5]);

@btime solution = solve(Bcell, MethodOfSteps(Tsit5()), save_everystep=false, maxiters=1e10);

The 1st example runs in 1.778 ms (76287 allocations: 1.29 MiB), and the 2nd example runs in 326.035 μs (7462 allocations: 1.39 MiB). I thought the idxs implementation is supposed to save me some allocations :sweat_smile:, but it’s 10 times more. Also, if I run the first example with save_everystep=true, then it runs in 1.813 ms (77228 allocations: 1.41 MiB), which is 941 more allocations than save_everystep=false, so I’m assuming that saving every step of the solution only takes 1/100 of what the interpolation function in the DDE solver takes, so I might as well just pre-allocate a cache to store every steps?

I think this is hitting https://github.com/JuliaLang/julia/issues/35800 . Could you open an issue on DelayDiffEq to track this?

1 Like

Posted here: https://github.com/SciML/DelayDiffEq.jl/issues/218

Thanks! Sorry that’s not the most satisfying answer, but now we have an example to track and get it fixed, though this might take a longer time to fix given that it seems to intersect with a pretty deep Julia bug.

Thanks for tracking down the problem! I was also the one who raised the weird vectorcontinuouscallback allocations problem, so I will wait for both fixes. Given your understanding of the DDE solver algorithm, do you have an expectation of how many allocations the DDE solver (using idxs in the history function) would have after the bug fix? Not the exact number, obviously, I’m just wondering if there is supposed to be 0 allocation, or some.

A few thousand is probably the right ballpark