Time Dependent Linear Map

I am trying to implement an initial problem. Let the system of ODEs be
\frac{d\vec{y}}{dt} = A(t;\vec{\theta})\vec{y}, \quad \vec{y}(0) = \vec{y}_0, \quad 0 \leq t \leq T.

The matrix A depends on the time t, as well a vector of control parameters \vec{\theta}.

The time-stepping method I’m using is similar to the trapezoidal rule. The timestepping is done by something like
(I - A(t_{n+1}, \vec\theta))\vec y_{n+1} = (I + A(t_n))\vec y_n

For reasons particular to my use-case, I would like the linear solves in my timestepping method to be iteratively using gmres! from the IterativeSolvers.jl package. What’s more, I would like to compute d\vec{y}/dt without actually computing the matrix A. I have a function apply_A!(dydt, y, t, theta), which compute the derivative of y and stores it in dydt.

I have been trying to do this using LinearMaps.jl, in a way like this:

using LinearMaps
using IterativeSolvers

#=
The problem being solved is 

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
    dydt .= (-2*t) .* y_in
    return nothing
end

function solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    t = 0.0

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt, y_in, t)
        out_vec .= y_in
        out_vec .-= 0.5*dt .* dydt
        return nothing
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        #display(y)

        t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    println("Final t = ", t)
    println("Final y = ", y)

    return y
end

I have something like this implemented, but my method is pretty slow, makes many more memory allocations than are reasonable, and I have tracked down that the vast majority of the memory allocations do not come from any of the code I have written (i.e. they are from gmres or the linear map, in the way that I have used them).

I am not very experienced with debugging type-allocation, but it seemed to me from the beginning that this might be a bad idea, since I create a function inside a function, which I then pass to gmres!. And so the function gmres! takes may not be known until runtime, messing with the compilation process. Also, compute_left_hand_side relies on variables outside the scope of that function. Although those variables are technically not global variables, which I know people generally advise against using, I fear it may be similarly bad for performance to use variables outside the function scope like I have here.

Does anyone have a “right” way to do this? I feel like I am working against the intended use of these packages, but I can’t think of another way to accomplish this. If there are packages besides LinearMaps and IterativeSolvers which would be more appropriate for this use-case, I am happy to switch.

EDIT

Changed code mock-up to a MWE (and one that is actually algorithmically correct).

Generally, a runnable example makes these things much easier for others to debug :slight_smile:

That being said, I am pretty sure you run into performance issues due to captured variables. More concretely, you define the closure compute_left_hand_side! that captures t, theta and dydt. Out of these theta appears to be constant (is never assigned to) but t and dydt certainly are assigned to and thus will cause issues. A quick fix would be to convert t and dydt into Refs such that the content of the variable is constant (the Ref) and you just mutate what it points to:

function solver(y0, theta, t0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))

    t = Ref(0.0)

    compute_left_hand_side! = let t=t, dydt = dydt, theta=theta # convert to locals to help the compiler, probably unnecessary though
      function compute_left_hand_side!(out_vec, y_in)
          apply_A!(dydt[], y_in, t[], theta)
          out_vec .= y_in
          dydt[] .+= dydt[] # what does this line actually do? Are you sure it is correct?
          return nothing
      end
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:nsteps
        t[] = n*dt
        apply_A!(dydt[], y, t_n, theta) # should t_n be t?
        right_hand_side = y + dydt[]

        t[] = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

In general you could perhaps rethink the code a bit and extract the locals that are constantly overwritten to a struct that when called uses its fields to compute_left_hand_side!. That could also be simpler to understand.

1 Like

You’re right, my bad. I’ve added a runnable example.

Thanks! I did not know about references or callable structs. I will try doing it with callable structs and report back how that affects performance (your suggestion of using references also makes sense to me, but I am terrified that I will screw something up if I do that in my actual use case).

I have implemented the original solver, another based on callable structs, another based on references, and another based on references but without the ‘let’ statement.

using LinearMaps
using IterativeSolvers

#=
The problem being solved is 

y' = -2t*y
y(t) = y₀*exp(-t²)

The scalar equation is done for each element of y, so you can solve the initial
value problem for multiple initial conditions.
=#

function apply_A!(dydt, y_in, t)
    dydt .= (-2*t) .* y_in
    return nothing
end

function solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    t = 0.0

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt, y_in, t)
        out_vec .= y_in
        out_vec .-= 0.5*dt .* dydt
        return nothing
    end

    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        #display(y)

        t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

# Callable struct version
mutable struct Astruct
    t::Float64
    dt::Float64
    dydt::Vector{Float64}
    function Astruct(t::Float64, dt::Float64, N::Int64)
        dydt = zeros(N)
        new(t, dt, dydt)
    end
end

function (a_struct::Astruct)(out_vec, y_in)
        apply_A!(a_struct.dydt, y_in, a_struct.t)
        out_vec .= y_in
        out_vec .-= 0.5*a_struct.dt .* a_struct.dydt
        return nothing
end

function struct_solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = zeros(length(y))

    right_hand_side = zeros(length(y))

    t0 = 0.0
    a_struct = Astruct(t0, dt, length(y))


    LHS_map = LinearMaps.LinearMap(
        a_struct,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t = n*dt
        apply_A!(dydt, y, t)
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt

        a_struct.t = (n+1)*dt
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end


function ref_solver(y0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))
    local_dt = Ref(dt)

    t = Ref(0.0)

    right_hand_side = zeros(length(y))

    compute_left_hand_side! = let t=t, dydt = dydt, local_dt = local_dt # convert to locals to help the compiler, probably unnecessary though
      function compute_left_hand_side!(out_vec, y_in)
          apply_A!(dydt[], y_in, t[])
          out_vec .= y_in
          dydt[] .-= 0.5*dt[] .* dydt[] 
        return nothing
      end
    end


    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t[] = n*dt
        apply_A!(dydt[], y, t[])
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt[]

        t[] = (n+1)*local_dt[]
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

function ref_solver_no_local(y0, dt, nsteps)
    y = copy(y0)
    dydt = Ref(zeros(length(y)))
    local_dt = Ref(dt)

    t = Ref(0.0)

    right_hand_side = zeros(length(y))

    function compute_left_hand_side!(out_vec, y_in)
        apply_A!(dydt[], y_in, t[])
        out_vec .= y_in
        dydt[] .-= 0.5*dt[] .* dydt[] 
      return nothing
    end


    LHS_map = LinearMaps.LinearMap(
        compute_left_hand_side!,
        length(y), length(y),
        ismutating=true
    )

    for n in 0:(nsteps-1)
        t[] = n*dt
        apply_A!(dydt[], y, t[])
        right_hand_side .= y
        right_hand_side .+= 0.5*dt .* dydt[]

        t[] = (n+1)*local_dt[]
        gmres!(y, LHS_map, right_hand_side)
    end

    return y
end

julia> include("examples/timestepping_structure.jl")
ref_solver_no_local (generic function with 1 method)

julia> using BenchmarkTools

julia> y0 = ones(3)
3-element Vector{Float64}:
 1.0
 1.0
 1.0

julia> @btime solver(y0, 0.1, 10);
  13.587 μs (205 allocations: 16.50 KiB)

julia> @btime struct_solver(y0, 0.1, 10);
  12.897 μs (175 allocations: 15.30 KiB)

julia> @btime ref_solver(y0, 0.1, 10);
  12.779 μs (175 allocations: 15.66 KiB)

julia> @btime ref_solver_no_local(y0, 0.1, 10);
  13.212 μs (175 allocations: 15.66 KiB)

Your suggestions do result in a measurable improvement in runtime and allocation. Empirically, gmres! performs 15 allocations even when using a plain matrix and vector as inputs, so that accounts for 150 allcoations, and the rest are from places we would expect allocation (e.g. when copying y). I checked this with allocation tracking.

Overall, the runtime is not improved as much as I hoped it would, but I suspect the improvement will be more significant when apply_A! is more complicated. And in any case, I like the callable struct solution better syntactically than what I was doing before.

I’ll let you know if things work out in the actual use case.

1 Like

I wonder whether this causes an allocation. Can you try with another “.”?
right_hand_side .+= 0.5 .* dt .* dydt

Same here in function (a_struct::Astruct)(out_vec, y_in):

out_vec .-= 0.5 .* a_struct.dt .* a_struct.dydt

If you wonder where the code spents its time I strongly suggest profiling it because guessing what takes time gets it wrong almost always :slight_smile:

1 Like