StaticArrays + ArrayPartition + DiffEq = Allocations?

graphitical · February 7, 2023, 5:40pm

I’m attempting to convert the DiffEq example using ArrayPartition to also use StaticArrays to understand how to feed in an ArrayPartition of StaticArrays for my own problem. This is meant to be a toy example to help me understand things a bit better.

I'm also following the guidance on passing StaticArrays into the problem from the DiffEq page on code optimization.

Problem: My f function does not allocate (this is good!) and @code_warntype shows no type instabilities, but when I time the actual solve there are a ton of allocations. This seems counter to what the page on code optimization indicates.

Question: What is going on here? Am I calling @btime incorrectly? Is there something else I need to do in f to prevent allocations?

MWE:

using Unitful, RecursiveArrayTools, OrdinaryDiffEq
using LinearAlgebra
using StaticArrays

r0 = SA[1131.340, -2282.343, 6672.423]u"km"
v0 = SA[-5.64305, 4.30333, 2.42879]u"km/s"
Δt = 86400.0*365u"s"
μ = 398600.4418u"km^3/s^2"
rv0 = ArrayPartition(r0,v0)

function f(y, μ, t)
    r = norm(y.x[1])
    dy1 = y.x[2]
    dy2 = -μ * y.x[1] / r^3
    ArrayPartition(dy1, dy2)
end


prob = ODEProblem(f, rv0, (0.0u"s", Δt), μ)

using BenchmarkTools
@btime f($rv0, $μ, 0.0) # ~11ns, 0 allocations
alg = Vern8()
save_everystep = false
@btime solve($prob, $alg, save_everystep=$save_everystep) # ~25ms, ~70k allocations

ChrisRackauckas · February 7, 2023, 7:33pm

I’d check this on v1.9 because the effects analysis improved and this may just be something where the compiler didn’t optimize it out on a given version.

graphitical · February 7, 2023, 9:14pm

Thanks Chris,

I took your advice and downloaded 1.9.0-beta3. What I see is that the total compute time is roughly flat compared with 1.8.5, but the allocations decreased significantly.

Does this mean the code is butting up against the speed of the actual calculation? This seems pretty slow for what appears to be a simple solve, but there also appears to be a history of people complaining about the efficiency of norm. IDK if that’s still relevant though.

Timing
Running identical code from above in two different environments:

Version 1.8.5


@btime f($rv0, $μ, 0.0)
  7.800 ns (0 allocations: 0 bytes)

@btime solve($prob, $alg, save_everystep=$save_everystep)
  16.090 ms (70550 allocations: 3.24 MiB)

Version 1.9.0-beta3

@btime f($rv0, $μ, 0.0)
  7.500 ns (0 allocations: 0 bytes)

@btime solve($prob, $alg, save_everystep=$save_everystep)
  15.072 ms (121 allocations: 11.98 KiB)

I just used add packagename manually to add everything, so I'm not sure whether both environments have exactly the same package versions, and I'm not sure how to check easily. IDK if that's important, but I thought I'd mention it.
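One way to compare the two environments, sketched here with the standard Pkg API, is to print the resolved package versions in each Julia install and diff the output by eye. Nothing below is specific to this problem; it only assumes the same environment is active as when the timings were taken.

```julia
using Pkg
using InteractiveUtils  # provides versioninfo()

# Julia version and platform details for this install.
versioninfo()

# Packages added directly to the active environment, with versions.
Pkg.status()

# Every resolved package, including indirect dependencies.
Pkg.status(; mode = Pkg.PKGMODE_MANIFEST)
```

Running this once under 1.8.5 and once under 1.9.0-beta3 shows whether OrdinaryDiffEq, RecursiveArrayTools, StaticArrays, and Unitful resolved to the same versions in both.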

ChrisRackauckas · February 8, 2023, 2:11am

Yup, this looks about as expected.

As for the old complaints about the efficiency of norm: that was fixed up.

What does the profile say? Share a flame graph. If most of the time is spent in the parts that aren't allocating, then the allocations are not the issue. Allocations can even improve performance in some cases.
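A minimal sketch of collecting such a profile with the standard-library Profile module, reusing the MWE setup from the first post. The repeat count of 20 and the mincount threshold are arbitrary choices for illustration, not recommendations.

```julia
using Unitful, RecursiveArrayTools, OrdinaryDiffEq
using LinearAlgebra, StaticArrays
using Profile  # standard-library sampling profiler

# Same setup as the MWE in the first post.
r0 = SA[1131.340, -2282.343, 6672.423]u"km"
v0 = SA[-5.64305, 4.30333, 2.42879]u"km/s"
Δt = 86400.0 * 365u"s"
μ = 398600.4418u"km^3/s^2"
rv0 = ArrayPartition(r0, v0)

function f(y, μ, t)
    r = norm(y.x[1])
    ArrayPartition(y.x[2], -μ * y.x[1] / r^3)
end

prob = ODEProblem(f, rv0, (0.0u"s", Δt), μ)

# Warm-up run so that compilation does not show up in the profile.
solve(prob, Vern8(); save_everystep = false)

Profile.clear()
@profile for _ in 1:20  # repeat to gather enough samples (20 is arbitrary)
    solve(prob, Vern8(); save_everystep = false)
end

# Flat listing sorted by sample count; mincount hides rarely hit frames.
Profile.print(; format = :flat, sortedby = :count, mincount = 20)
```

The text listing is enough to see where the samples concentrate. For an actual flame graph, the same collected data can be rendered by a viewer package such as ProfileView.jl or PProf.jl, which are separate installs.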

For the lowest overhead case, try the Vern implementations in SimpleDiffEq.jl, namely GPUVern7 and GPUVern9. If it's a dead simple ODE like this, then those should have essentially zero overhead since they are just the loop.

The big thing to ask is whether the Verner methods are the right ones for the job here. At the tolerances you’re choosing, the answer is probably no.
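One way to probe that question is to time a lower-order method against Vern8 at the default tolerances. The sketch below makes several assumptions that are not from the thread: Tsit5 is picked as the comparison method, units are dropped (km, km/s, and s are implied), and the time span is cut to one day to keep the run short.

```julia
using OrdinaryDiffEq, RecursiveArrayTools, StaticArrays, LinearAlgebra
using BenchmarkTools

# Assumption: unitless version of the MWE state (km and km/s implied).
r0 = SA[1131.340, -2282.343, 6672.423]
v0 = SA[-5.64305, 4.30333, 2.42879]
μ = 398600.4418  # km^3/s^2 implied
rv0 = ArrayPartition(r0, v0)

function f(y, μ, t)
    r = norm(y.x[1])
    ArrayPartition(y.x[2], -μ * y.x[1] / r^3)
end

# Assumption: one-day span (in seconds) instead of the one-year span of the MWE.
prob = ODEProblem(f, rv0, (0.0, 86400.0), μ)

# Solve once with each method at default tolerances (also serves as warm-up).
sol_tsit = solve(prob, Tsit5(); save_everystep = false)
sol_vern = solve(prob, Vern8(); save_everystep = false)

# Compare the final states to see how far the two methods disagree.
println("position difference: ", norm(sol_tsit.u[end].x[1] - sol_vern.u[end].x[1]))

# Time both methods on the same problem.
@btime solve($prob, Tsit5(); save_everystep = false)
@btime solve($prob, Vern8(); save_everystep = false)
```

Timing alone does not settle the choice, since the two methods reach different accuracy at the same tolerance settings. The printed position difference gives a rough sense of that gap, and tightening abstol and reltol is the usual next step if high accuracy is the goal.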
