I am solving an ODE with my own version of a Dormand Prince solver, and previously I have been using an in-place version, which has worked well for me, in that the allocations come from allocating for the array containing the result. Now, I’ve been trying to rewrite this in-place solver using Static Arrays, and for the most part, this implementation seems to be working ok.
using BenchmarkTools
using StaticArrays
using LinearAlgebra
function dpsolveinplace(x0,f!,dt,n::Int;final=0,printout::Bool=true)
#=
Dormand-Prince solver for a differential equation
of the form x' = f(x)
initial condition x0
time step dt
number of time steps n
final = true means only return the final value
printout tells if you want to print out the progress
=#
N = length(x0)
b = dpmat()
kmat = zeros(typeof(x0[1]),7,N)
ubuffer = zeros(typeof(x0[1]),N)
fbuffer = zeros(typeof(x0[1]),N)
res = zeros(typeof(x0[1]),n+1,N)
res[1,:] .= x0
a = 0
for count in 1:n
f!(fbuffer,@view res[count,:])
kmat[1,:] .= fbuffer
for j in 2:7
for k in 1:N
ubuffer[k] = 0.0
for i in 1:j-1
ubuffer[k] += b[i,j-1]*kmat[i,k]
end
ubuffer[k] = dt*ubuffer[k] + res[count,k]
end
f!(fbuffer,ubuffer)
for k in 1:N
kmat[j,k] = fbuffer[k]
end
end
for k in 1:N
res[count+1,k] = res[count,k] + dt*sum(b[i,6]*kmat[i,k] for i in 1:6)
end
if mod(count,convert(Int,round(n/20))) == 0 && printout
println("Progress = ",round(100*count/n,digits=2),"%")
end
end
if typeof(final) == Bool
return @view res[end,:]
else
return res
end
end
function dpsolve_static(x0,f,dt,n::Int;final=0,printout::Bool=true)
#=
Dormand-Prince solver for a differential equation
of the form x' = f(x)
Uses static arrays
Also uses an in-place function to mutate the static array
initial condition x0
time step dt
number of time steps n
Uses SVectors
final = true means only return the final value
=#
N = length(x0)
b = dpmat()
ubuffer = @MVector zeros(eltype(x0),N)
buffer = copy(ubuffer)
sumbuffer = copy(ubuffer)
res = Array{SVector{N,eltype(x0)}}(undef,n+1)
kmat = Array{SVector{N,eltype(x0)}}(undef,7)
for i in 1:7
kmat[i] = zeros(eltype(x0),N)
end
res[1] = copy(x0)
a = 0
for count in 1:n
kmat[1] = f(res[count])
for j in 2:7
for k in 1:N
ubuffer[k] = 0
for i in 1:j-1
ubuffer[k] += b[i,j-1]*kmat[i][k]
end
end
buffer = res[count] + dt*ubuffer
kmat[j] = f(buffer)
end
sumbuffer .= 0.0
for i in 1:6
for k in 1:N
sumbuffer[k] += b[i,6]*kmat[i][k]
end
end
for k in 1:N
sumbuffer[k] = sumbuffer[k]*dt + res[count][k]
end
res[count+1] = SVector(sumbuffer)
if mod(count,convert(Int,round(n/20))) == 0 && printout
println("Progress = ",round(100*count/n,digits=2),"%")
end
end
a > 0 && println(a)
if typeof(final) == Bool
return @view res[end,:]
else
return res
end
end
function dpmat() #Dormand Prince matrix
b = @MMatrix zeros(6,6)
b[1,1] = 1/5
b[1,2] = 3/40
b[2,2] = 9/40
b[1,3] = 44/45
b[2,3] = -56/15
b[3,3] = 32/9
b[1,4] = 19372/6561
b[2,4] = -25360/2187
b[3,4] = 64448/6561
b[4,4] = -212/729
b[1,5] = 9017/3168
b[2,5] = -355/33
b[3,5] = 46732/5247
b[4,5] = 49/176
b[5,5] = -5103/18656
b[1,6] = 35/384
b[2,6] = 0
b[3,6] = 500/1113
b[4,6] = 125/192
b[5,6] = -2187/6784
b[6,6] = 11/84
return b
end
where I have chosen to define the Dormand Prince coefficient matrix dpmat() as a MMatrix because the syntax to define it as a SMatrix immediately would be hideous.
I wrote some 2D test functions to check my implementations.
function testfun_2d(x)
x1 = x[1] + x[2]
x2 = x[1] - x[2]
return @SVector[x1,x2]
end
function testfun_2d!(u,x)
u[1] = x[1] + x[2]
u[2] = x[1] - x[2]
end
@btime dpsolveinplace([1.0,1.0],testfun_2d!,1e-3,1000,final=true,printout=false)
138.900 μs (6 allocations: 16.58 KiB)
2-element view(::Matrix{Float64}, 1001, :) with eltype Float64:
4.914781300625746
2.1781835566085666
@btime dpsolve_static(@SVector[1.0,1.0],testfun_2d,1e-3,1000,final=true,printout=false)
39.600 μs (12 allocations: 17.05 KiB)
1-element view(::Matrix{SVector{2, Float64}}, 1001, :) with eltype SVector{2, Float64}:
[4.914781300625746, 2.1781835566085666]
I am a little unsure why the static array method allocates slightly more memory, but it’s less than 1 kB so that’s not my query.
I have a question with creating a Static Array where I need to run a loop to determine each entry. The array is then passed to an ODE solver. Here is an in-place example of what I am using.
function static_test!(u,x;N=6)
u[1] = 2
u[2] = 3
for i in 1:N
u[i+2] = i
end
u[9] = 1
u[10] = 2
end
Suppose, in principle, that I don’t know (at compilation) the value of N. However, my function will eventually calculate N before solving the ODE. In this case, how should I generate my Static Array? The following, for instance, is fairly slow.
function static_test(x)
return SVector{10,Float64}([2;3;collect(1:6);1;2])
end
where I have chosen N = 6. The two solves return (leaving out the return value)
@btime dpsolve_static(@SVector[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0],static_test,1e-3,1000,final=0,printout=false)`
24.788 ms (335012 allocations: 10.56 MiB)
@btime dpsolveinplace(ones(10),static_test!,1e-3,1000,final=0,printout=false)
446.700 μs (7 allocations: 79.75 KiB)
Why is there such a large difference in the memory allocations now? Is there a better way to define static_test
, other than hard-coding @SVector[2,3,1,2,3,4,5,6,1,2]
, which isn’t generally an option?