Time Dependent Linear Map

leespen1 · February 14, 2024, 2:15am

I am trying to implement an initial problem. Let the system of ODEs be
\frac{d\vec{y}}{dt} = A(t;\vec{\theta})\vec{y}, \quad \vec{y}(0) = \vec{y}_0, \quad 0 \leq t \leq T.

The matrix A depends on the time t, as well a vector of control parameters \vec{\theta}.

The time-stepping method I’m using is similar to the trapezoidal rule. The timestepping is done by something like
(I - A(t_{n+1}, \vec\theta))\vec y_{n+1} = (I + A(t_n))\vec y_n

For reasons particular to my use-case, I would like the linear solves in my timestepping method to be iteratively using gmres! from the IterativeSolvers.jl package. What’s more, I would like to compute d\vec{y}/dt without actually computing the matrix A. I have a function apply_A!(dydt, y, t, theta), which compute the derivative of y and stores it in dydt.

I have been trying to do this using LinearMaps.jl, in a way like this:

using LinearMaps
using IterativeSolvers

#=
The problem being solved is 

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
    dydt .= (-2*t) .* y_in
    return nothing
end

function solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    t = 0.0

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt, y_in, t)
        out_vec .= y_in
        out_vec .-= 0.5*dt .* dydt
        return nothing
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        #display(y)

        t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    println("Final t = ", t)
    println("Final y = ", y)

    return y
end

I have something like this implemented, but my method is pretty slow, makes many more memory allocations than are reasonable, and I have tracked down that the vast majority of the memory allocations do not come from any of the code I have written (i.e. they are from gmres or the linear map, in the way that I have used them).

I am not very experienced with debugging type-allocation, but it seemed to me from the beginning that this might be a bad idea, since I create a function inside a function, which I then pass to gmres!. And so the function gmres! takes may not be known until runtime, messing with the compilation process. Also, compute_left_hand_side relies on variables outside the scope of that function. Although those variables are technically not global variables, which I know people generally advise against using, I fear it may be similarly bad for performance to use variables outside the function scope like I have here.

Does anyone have a “right” way to do this? I feel like I am working against the intended use of these packages, but I can’t think of another way to accomplish this. If there are packages besides LinearMaps and IterativeSolvers which would be more appropriate for this use-case, I am happy to switch.

EDIT

Changed code mock-up to a MWE (and one that is actually algorithmically correct).

abraemer · February 14, 2024, 7:59am

Generally, a runnable example makes these things much easier for others to debug

That being said, I am pretty sure you run into performance issues due to captured variables. More concretely, you define the closure compute_left_hand_side! that captures t, theta and dydt. Out of these theta appears to be constant (is never assigned to) but t and dydt certainly are assigned to and thus will cause issues. A quick fix would be to convert t and dydt into Refs such that the content of the variable is constant (the Ref) and you just mutate what it points to:

function solver(y0, theta, t0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))

    t = Ref(0.0)

    compute_left_hand_side! = let t=t, dydt = dydt, theta=theta # convert to locals to help the compiler, probably unnecessary though
      function compute_left_hand_side!(out_vec, y_in)
          apply_A!(dydt[], y_in, t[], theta)
          out_vec .= y_in
          dydt[] .+= dydt[] # what does this line actually do? Are you sure it is correct?
          return nothing
      end
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:nsteps
        t[] = n*dt
        apply_A!(dydt[], y, t_n, theta) # should t_n be t?
        right_hand_side = y + dydt[]

        t[] = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

In general you could perhaps rethink the code a bit and extract the locals that are constantly overwritten to a struct that when called uses its fields to compute_left_hand_side!. That could also be simpler to understand.

leespen1 · February 14, 2024, 2:57pm

You’re right, my bad. I’ve added a runnable example.

leespen1 · February 14, 2024, 3:03pm

Thanks! I did not know about references or callable structs. I will try doing it with callable structs and report back how that affects performance (your suggestion of using references also makes sense to me, but I am terrified that I will screw something up if I do that in my actual use case).

leespen1 · February 14, 2024, 4:13pm

I have implemented the original solver, another based on callable structs, another based on references, and another based on references but without the ‘let’ statement.

using LinearMaps
using IterativeSolvers

#=
The problem being solved is 

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
    dydt .= (-2*t) .* y_in
    return nothing
end

function solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    t = 0.0

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt, y_in, t)
        out_vec .= y_in
        out_vec .-= 0.5*dt .* dydt
        return nothing
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        #display(y)

        t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

# Callable struct version
mutable struct Astruct
    t::Float64
    dt::Float64
    dydt::Vector{Float64}
    function Astruct(t::Float64, dt::Float64, N::Int64)
        dydt = zeros(N)
        new(t, dt, dydt)
    end
end

function (a_struct::Astruct)(out_vec, y_in)
        apply_A!(a_struct.dydt, y_in, a_struct.t)
        out_vec .= y_in
        out_vec .-= 0.5*a_struct.dt .* a_struct.dydt
        return nothing
end

function struct_solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    right_hand_side = zeros(length(y))

    t0 = 0.0
    a_struct = Astruct(t0, dt, length(y))


    LHS_map = LinearMaps.LinearMap(
        a_struct,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        a_struct.t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end


function ref_solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))
    local_dt = Ref(dt)

    t = Ref(0.0)

    right_hand_side = zeros(length(y))

    compute_left_hand_side! = let t=t, dydt = dydt, local_dt = local_dt # convert to locals to help the compiler, probably unnecessary though
      function compute_left_hand_side!(out_vec, y_in)
          apply_A!(dydt[], y_in, t[])
          out_vec .= y_in
          dydt[] .-= 0.5*dt[] .* dydt[] 
        return nothing
      end
    end


    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t[] = n*dt
        apply_A!(dydt[], y, t[])
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt[]

        t[] = (n+1)*local_dt[]
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

function ref_solver_no_local(y0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))
    local_dt = Ref(dt)

    t = Ref(0.0)

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt[], y_in, t[])
        out_vec .= y_in
        dydt[] .-= 0.5*dt[] .* dydt[] 
      return nothing
    end


    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t[] = n*dt
        apply_A!(dydt[], y, t[])
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt[]

        t[] = (n+1)*local_dt[]
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

julia> include("examples/timestepping_structure.jl")
ref_solver_no_local (generic function with 1 method)

julia> using BenchmarkTools

julia> y0 = ones(3)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

julia> @btime solver(y0, 0.1, 10);
  13.587 μs (205 allocations: 16.50 KiB)

julia> @btime struct_solver(y0, 0.1, 10);
  12.897 μs (175 allocations: 15.30 KiB)

julia> @btime ref_solver(y0, 0.1, 10);
  12.779 μs (175 allocations: 15.66 KiB)

julia> @btime ref_solver_no_local(y0, 0.1, 10);
  13.212 μs (175 allocations: 15.66 KiB)

Your suggestions do result in a measurable improvement in runtime and allocation. Empirically, gmres! performs 15 allocations even when using a plain matrix and vector as inputs, so that accounts for 150 allcoations, and the rest are from places we would expect allocation (e.g. when copying y). I checked this with allocation tracking.

Overall, the runtime is not improved as much as I hoped it would, but I suspect the improvement will be more significant when apply_A! is more complicated. And in any case, I like the callable struct solution better syntactically than what I was doing before.

I’ll let you know if things work out in the actual use case.

abraemer · February 14, 2024, 5:43pm

I wonder whether this causes an allocation. Can you try with another “.”?
right_hand_side .+= 0.5 .* dt .* dydt

Same here in function (a_struct::Astruct)(out_vec, y_in):

out_vec .-= 0.5 .* a_struct.dt .* a_struct.dydt

If you wonder where the code spents its time I strongly suggest profiling it because guessing what takes time gets it wrong almost always

Topic		Replies	Views
IterativeSolvers.jl not working as expected with mutating and allocation General Usage benchmarktools , preallocation , iterative-solvers	9	337	March 5, 2024
Type Inference Issues in IterativeSolvers with LinearMaps Performance package , packages , profiling , type-stability , cthulhu	2	220	November 9, 2023
Unified Interface for Linear Solving Numerics	34	6069	January 25, 2017
Unbounded IterativeSolvers.Gmres! different output to MATLAB, slows convergence significantly Performance	13	1876	March 7, 2019
Reduce allocations LinearSolve Performance	4	308	September 22, 2023

Time Dependent Linear Map

EDIT

Related topics