New scientific computing benchmark - Network of differential equations

For our recent paper on NetworkDynamics.jl, we developed a benchmark problem. The goal is to solve a Kuramoto system on a sparse random graph, i.e.

\dot \theta_i = \omega_i + \sigma \sum_{j=1}^N sin(\theta_j - \theta_i)

In the concrete implementations we tried to strike a balance between usability (writing code should feel “simple”) and performance. We compare two implementations in Python (SciPy and JiTCODE, the later JIT-compiles a C function for the ODE), two in Julia (NetworkDynamics.jl and incidence matrix multiplication with SparseArrays.jl) and one in Fortran (similar approach as for SparseArrays.jl).

The JITCODE (and SciPy) version is derived from this publication and does not make use of the symmetry of the sine function sin(y-x) = -sin(x-y), which means that the right hand side function evaluates roughly twice as many floating point operations.

All programs employ Dormand-Prince’s 5th order Runge-Kutta method. For Julia we use DifferentialEquations.jl to solve the ODE.

Work precision diagram of a network with 10 Kuramoto oscillators.
Work precision diagram of a network with 100 Kuramoto oscillators.
Work precision diagram of a network with 1000 Kuramoto oscillators.

The Julia programs finished considerably earlier than their competitors and most of the time faster than the program written in Fortran, with Fortran catching up for the larger system. Notably, NetworkDynamics is as fast as SparseArray multiplication indicating that the convenient abstractions that facilitate model building are essentially for free.

When startup and JIT-compile time is taking into account for all runtimes we see a different picture. The following plot shows the time until the first trajectory is solved.

System size vs. total time

Fortran isn’t shown since that program doesn’t have a JIT compile stage. In fact the whole Fortan benchmark (300 integrations) finishes roughly in the time it takes to startup a Julia session and import all required libraries (Julia 1.5.1). All code used to generate these plots can be found HERE. If you know how to improve the benchmarks or want to add your favourite network dynamics solution in yet another language, we are happy about contributions. Just open a pull request on GitHub.

16 Likes

Cool work! I’ll link to it from the SciMLBenchmarks.

The setup time, is it mostly symbolic or LLVM compilation? If it’s symbolic time then you might want to try with Specialize `Base.isequal` by YingboMa · Pull Request #254 · JuliaSymbolics/SymbolicUtils.jl · GitHub . If you can get us a nice way to time that we can take a deeper look.

1 Like

We’re not doing symbolic calculations in the Julia scripts (only JITCODE.py does that). The setup time includes the loading of all required packages (using OrdinaryDiffEq, ...) so that’s expected to be quite high.

Try running on 1.6 the loading should be multiple times faster.

There seems to be a lot of global variables, and eval and stuff happening outside of functions.

I’m a bit perplexed by the programming style, but are you sure this is efficient and in keeping with the performance tips ?

Also, the benchmarks are done with the CPUTime.jl library, with which I’m not familiar, but seems quite old, and comes with a warning to “only use this package if you know what you’re doing”. The canonical benchmarking package in Julia is BenchmarkTools.jl. What is the rationale behind this choice?

5 Likes

You’re right, that should be faster in 1.6. However, we did the benchmarks a few weeks ago and the new version was not yet released. Eventually we will repeat them.

Thank you for you comments. I agree that the scripts look unnecessarily complicated but that’s because we’re benchmarking in different languages. CPUtime seems to be more suited to that, since it should easily comparable across languages and we wanted to time larger chuncks of code like the loading of modules and the execution of several preparatory functions in succession. On the other hand we decided to build an sqlite database that can be written to from each language, and much of the code is concerned with this kind of input / output handling. Performance critical variables are defined as const, so they shouldn’t cause problem, see e.g. here.

	g = SimpleGraph(my_matrix')
	const B = incidence_matrix(g, oriented = true)
	const B_t = transpose(B)

	function kuramoto_network!(dx, x, p, t)
		dx .= p .-  coupling_constant .* (B * sin.(B_t * x))
		return nothing
	end

I see. I can’t really tell if this way of coding the benchmarks will have a negative impact, but perhaps someone more experienced with benchmarking packages could comment.

It also matches the more straightforward, but less systematic stuff using BenchmarkTools (which is of course also our usual go to for benchmarking).

Also, Julia wouldn’t be beating Fortran if there were such problems here.