# Time Dependent Linear Map

I am trying to implement an initial problem. Let the system of ODEs be
\frac{d\vec{y}}{dt} = A(t;\vec{\theta})\vec{y}, \quad \vec{y}(0) = \vec{y}_0, \quad 0 \leq t \leq T.

The matrix A depends on the time t, as well a vector of control parameters \vec{\theta}.

The time-stepping method I’m using is similar to the trapezoidal rule. The timestepping is done by something like
(I - A(t_{n+1}, \vec\theta))\vec y_{n+1} = (I + A(t_n))\vec y_n

For reasons particular to my use-case, I would like the linear solves in my timestepping method to be iteratively using gmres! from the IterativeSolvers.jl package. What’s more, I would like to compute d\vec{y}/dt without actually computing the matrix A. I have a function apply_A!(dydt, y, t, theta), which compute the derivative of y and stores it in dydt.

I have been trying to do this using LinearMaps.jl, in a way like this:

using LinearMaps
using IterativeSolvers

#=
The problem being solved is

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
dydt .= (-2*t) .* y_in
return nothing
end

function solver(y0, dt, nsteps)
y = copy(y0)
dydt = zeros(length(y))

t = 0.0

right_hand_side = zeros(length(y))

function compute_left_hand_side!(out_vec, y_in)
apply_A!(dydt, y_in, t)
out_vec .= y_in
out_vec .-= 0.5*dt .* dydt
return nothing
end

LHS_map = LinearMaps.LinearMap(
compute_left_hand_side!,
length(y), length(y),
ismutating=true
)

for n in 0:(nsteps-1)
t = n*dt
apply_A!(dydt, y, t)
right_hand_side .= y
right_hand_side .+= 0.5*dt .* dydt

#display(y)

t = (n+1)*dt
gmres!(y, LHS_map, right_hand_side)
end

println("Final t = ", t)
println("Final y = ", y)

return y
end


I have something like this implemented, but my method is pretty slow, makes many more memory allocations than are reasonable, and I have tracked down that the vast majority of the memory allocations do not come from any of the code I have written (i.e. they are from gmres or the linear map, in the way that I have used them).

I am not very experienced with debugging type-allocation, but it seemed to me from the beginning that this might be a bad idea, since I create a function inside a function, which I then pass to gmres!. And so the function gmres! takes may not be known until runtime, messing with the compilation process. Also, compute_left_hand_side relies on variables outside the scope of that function. Although those variables are technically not global variables, which I know people generally advise against using, I fear it may be similarly bad for performance to use variables outside the function scope like I have here.

Does anyone have a “right” way to do this? I feel like I am working against the intended use of these packages, but I can’t think of another way to accomplish this. If there are packages besides LinearMaps and IterativeSolvers which would be more appropriate for this use-case, I am happy to switch.

# EDIT

Changed code mock-up to a MWE (and one that is actually algorithmically correct).

Generally, a runnable example makes these things much easier for others to debug

That being said, I am pretty sure you run into performance issues due to captured variables. More concretely, you define the closure compute_left_hand_side! that captures t, theta and dydt. Out of these theta appears to be constant (is never assigned to) but t and dydt certainly are assigned to and thus will cause issues. A quick fix would be to convert t and dydt into Refs such that the content of the variable is constant (the Ref) and you just mutate what it points to:

function solver(y0, theta, t0, dt, nsteps)
y = copy(y0)
dydt = Ref(zeros(length(y)))

t = Ref(0.0)

compute_left_hand_side! = let t=t, dydt = dydt, theta=theta # convert to locals to help the compiler, probably unnecessary though
function compute_left_hand_side!(out_vec, y_in)
apply_A!(dydt[], y_in, t[], theta)
out_vec .= y_in
dydt[] .+= dydt[] # what does this line actually do? Are you sure it is correct?
return nothing
end
end

LHS_map = LinearMaps.LinearMap(
compute_left_hand_side!,
length(y), length(y),
ismutating=true
)

for n in 0:nsteps
t[] = n*dt
apply_A!(dydt[], y, t_n, theta) # should t_n be t?
right_hand_side = y + dydt[]

t[] = (n+1)*dt
gmres!(y, LHS_map, right_hand_side)
end

return y
end


In general you could perhaps rethink the code a bit and extract the locals that are constantly overwritten to a struct that when called uses its fields to compute_left_hand_side!. That could also be simpler to understand.

1 Like

Thanks! I did not know about references or callable structs. I will try doing it with callable structs and report back how that affects performance (your suggestion of using references also makes sense to me, but I am terrified that I will screw something up if I do that in my actual use case).

I have implemented the original solver, another based on callable structs, another based on references, and another based on references but without the ‘let’ statement.

using LinearMaps
using IterativeSolvers

#=
The problem being solved is

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
dydt .= (-2*t) .* y_in
return nothing
end

function solver(y0, dt, nsteps)
y = copy(y0)
dydt = zeros(length(y))

t = 0.0

right_hand_side = zeros(length(y))

function compute_left_hand_side!(out_vec, y_in)
apply_A!(dydt, y_in, t)
out_vec .= y_in
out_vec .-= 0.5*dt .* dydt
return nothing
end

LHS_map = LinearMaps.LinearMap(
compute_left_hand_side!,
length(y), length(y),
ismutating=true
)

for n in 0:(nsteps-1)
t = n*dt
apply_A!(dydt, y, t)
right_hand_side .= y
right_hand_side .+= 0.5*dt .* dydt

#display(y)

t = (n+1)*dt
gmres!(y, LHS_map, right_hand_side)
end

return y
end

# Callable struct version
mutable struct Astruct
t::Float64
dt::Float64
dydt::Vector{Float64}
function Astruct(t::Float64, dt::Float64, N::Int64)
dydt = zeros(N)
new(t, dt, dydt)
end
end

function (a_struct::Astruct)(out_vec, y_in)
apply_A!(a_struct.dydt, y_in, a_struct.t)
out_vec .= y_in
out_vec .-= 0.5*a_struct.dt .* a_struct.dydt
return nothing
end

function struct_solver(y0, dt, nsteps)
y = copy(y0)
dydt = zeros(length(y))

right_hand_side = zeros(length(y))

t0 = 0.0
a_struct = Astruct(t0, dt, length(y))

LHS_map = LinearMaps.LinearMap(
a_struct,
length(y), length(y),
ismutating=true
)

for n in 0:(nsteps-1)
t = n*dt
apply_A!(dydt, y, t)
right_hand_side .= y
right_hand_side .+= 0.5*dt .* dydt

a_struct.t = (n+1)*dt
gmres!(y, LHS_map, right_hand_side)
end

return y
end

function ref_solver(y0, dt, nsteps)
y = copy(y0)
dydt = Ref(zeros(length(y)))
local_dt = Ref(dt)

t = Ref(0.0)

right_hand_side = zeros(length(y))

compute_left_hand_side! = let t=t, dydt = dydt, local_dt = local_dt # convert to locals to help the compiler, probably unnecessary though
function compute_left_hand_side!(out_vec, y_in)
apply_A!(dydt[], y_in, t[])
out_vec .= y_in
dydt[] .-= 0.5*dt[] .* dydt[]
return nothing
end
end

LHS_map = LinearMaps.LinearMap(
compute_left_hand_side!,
length(y), length(y),
ismutating=true
)

for n in 0:(nsteps-1)
t[] = n*dt
apply_A!(dydt[], y, t[])
right_hand_side .= y
right_hand_side .+= 0.5*dt .* dydt[]

t[] = (n+1)*local_dt[]
gmres!(y, LHS_map, right_hand_side)
end

return y
end

function ref_solver_no_local(y0, dt, nsteps)
y = copy(y0)
dydt = Ref(zeros(length(y)))
local_dt = Ref(dt)

t = Ref(0.0)

right_hand_side = zeros(length(y))

function compute_left_hand_side!(out_vec, y_in)
apply_A!(dydt[], y_in, t[])
out_vec .= y_in
dydt[] .-= 0.5*dt[] .* dydt[]
return nothing
end

LHS_map = LinearMaps.LinearMap(
compute_left_hand_side!,
length(y), length(y),
ismutating=true
)

for n in 0:(nsteps-1)
t[] = n*dt
apply_A!(dydt[], y, t[])
right_hand_side .= y
right_hand_side .+= 0.5*dt .* dydt[]

t[] = (n+1)*local_dt[]
gmres!(y, LHS_map, right_hand_side)
end

return y
end


julia> include("examples/timestepping_structure.jl")
ref_solver_no_local (generic function with 1 method)

julia> using BenchmarkTools

julia> y0 = ones(3)
3-element Vector{Float64}:
1.0
1.0
1.0

julia> @btime solver(y0, 0.1, 10);
13.587 μs (205 allocations: 16.50 KiB)

julia> @btime struct_solver(y0, 0.1, 10);
12.897 μs (175 allocations: 15.30 KiB)

julia> @btime ref_solver(y0, 0.1, 10);
12.779 μs (175 allocations: 15.66 KiB)

julia> @btime ref_solver_no_local(y0, 0.1, 10);
13.212 μs (175 allocations: 15.66 KiB)


Your suggestions do result in a measurable improvement in runtime and allocation. Empirically, gmres! performs 15 allocations even when using a plain matrix and vector as inputs, so that accounts for 150 allcoations, and the rest are from places we would expect allocation (e.g. when copying y). I checked this with allocation tracking.

Overall, the runtime is not improved as much as I hoped it would, but I suspect the improvement will be more significant when apply_A! is more complicated. And in any case, I like the callable struct solution better syntactically than what I was doing before.

I’ll let you know if things work out in the actual use case.

1 Like

I wonder whether this causes an allocation. Can you try with another “.”?
right_hand_side .+= 0.5 .* dt .* dydt

Same here in function (a_struct::Astruct)(out_vec, y_in):

out_vec .-= 0.5 .* a_struct.dt .* a_struct.dydt


If you wonder where the code spents its time I strongly suggest profiling it because guessing what takes time gets it wrong almost always

1 Like