Performance of large sparse discrete map problems in DifferentialEquations

Hello,

I normally deal with dynamics on networks, so I have giant sparse systems of coupled ODE’s. The problem I have at the moment is that for a simple discrete map, a naive implementation is significantly faster than solving the DiscreteProblem provided by DIfferentialEquations.jl. For example, the code below is a MWE of what my problems look like,

using DifferentialEquations
using Random
using SparseArrays

# parametes
rng = MersenneTwister(0)
N = 80000
M = 150000
T = 10
β = 0.4
A = sprand(rng, N, M, 0.0001, (r, x) -> rand(r, [1], x))
p = (β, A)
###

function map!(unp1, un, p, t)
    β, A = p
    unp1 .= β .* un .* A
end

function naive_test(u0, tspan, p)
    unp1 = similar(u0)
    for i in tspan[1]:tspan[2]
        map!(unp1, u0, p, i)
    end
    return unp1
end

# test
u0 = sprand(N, M, 0.0001)
prob = DiscreteProblem(map!, u0, (0, T), p)

@time solve(prob; save_on=false);

@time naive_test(u0, (0, T), p);

For the solve function I turn off saving of intermediate results to better match the naive test and I get the following,

julia> @time solve(prob; save_on=false);
205.226954 seconds (16.25 M allocations: 1.241 GiB, 0.10% gc time, 1.79% compilation time)

and for the naive_test implementation I get

julia> @time naive_test(u0, (0, T), p);
  0.142919 seconds (16.55 k allocations: 20.360 MiB, 7.48% compilation time)

I was wondering if I’m missing something and this sort of performance is expected, I know there is a significant amount of machinery behind diffeq and it might not suit sparse arrays. Any help or explanation would be appreciated.

Using DifferentialEquations v7.1.0 and julia 1.7.1.

Rory.

2 Likes

Reducing the problem size and testing it like so

using LinearAlgebra
using SimpleDiffEq
using DifferentialEquations
using OrdinaryDiffEq
using Random
using SparseArrays
using BenchmarkTools

# parametes
const rng = MersenneTwister(0)
const N = 8000
const M = 15000
const T = 10
const β = 0.4
const A = sprand(rng, N, M, 0.0001, (r, x) -> rand(r, [1], x))
const p = (β, A)
###

function map!(unp1, un, p, t)
    β, A = p
    unp1 .= β .* un .* A
end

function naive_test(u0, tspan, p)
    unp1 = similar(u0)
    for i in tspan[1]:tspan[2]
        map!(unp1, u0, p, i)
    end
    return unp1
end

# test
const u0 = sprand(N, M, 0.0001)
const prob = DiscreteProblem(map!, u0, (0, T), p)

y1 = solve(prob, FunctionMap(); save_on=false);
y2 = solve(prob, SimpleFunctionMap());
println(norm(y2[end] - y1[end]))
y3 = naive_test(u0, (0, T), p);
println(norm(y3 - y2[end]))

@btime solve(prob, FunctionMap(); save_on=false);
@btime solve(prob, SimpleFunctionMap());
@btime naive_test(u0, (0, T), p);
println()

yields something like this for me

0.0
0.40618462634054964
  2.144 s (58 allocations: 1.72 MiB)
  2.158 ms (67 allocations: 3.28 MiB)
  3.318 ms (6 allocations: 305.64 KiB)

I’m wondering about the deviation in results and if the performance difference of FunctionMap and SimpleFunctionMap is justified in this case?

Edit:

  • The deviation in results is probably due to a different number of rounds
  • Runtime in FunctionMap is dominated by _any(::typeof(DiffEqBase.NAN_CHECK), ::SparseMatrixCSC{Float64, Int64}, ::Colon), which seems like a bad idea…

Edit: Here is a flame graph

2 Likes

Well NaN check in and of itself is a good idea: you want to look for NaNs and if you see any then you want to stop calculating and throw the user a warning. What seems to be the issue is that any(isnan,A::SparseMatrixCSC) seems like it’s looping over all values, not all non-zero values, which for a very sparse matrix would take considerably longer than any other specialized calculation. We should specialize this in SciMLBase. Could you open an issue there?

2 Likes

Try this PR branch: https://github.com/SciML/DiffEqBase.jl/pull/732

2 Likes

Nice, this does the trick for me:

  2.081 ms (58 allocations: 1.72 MiB)
  2.172 ms (67 allocations: 3.27 MiB)
  3.266 ms (6 allocations: 304.77 KiB)

Just a remark:

@inline NAN_CHECK(x::SparseArrays.AbstractSparseMatrixCSC) = any(NAN_CHECK, nonzeros(x))

seems to work, too.

1 Like

Thanks for fixing the error, I edited my original post so it should at least run. I should have run it in a clean repl before posting.

Just tested this branch and it works perfectly, I’m getting the same results as @goerch. Thanks a million, I have to say Julia’s biggest strength is it’s community :grin:

Yeah, that’s probably better.

Merged.