Improvements for solving an ODE "piecewise"

Consider the following problem: I have a linear ODE x' = M(t) x, but at some times t_i during the integration interval I need to apply some operation on the current state, i.e x_\text{new}(t_i) = A \cdot x(t).

A little toy model with DifferentialEquations.jl:

using DifferentialEquations
using LinearAlgebra
using Random
using TimerOutputs

# +++ DEFINE ODE +++
# vectorized form of equation x' = M * x, where x is Matrix
Random.seed!(25)
M = kron(I(10), rand(Float64, (10, 10)))

function f(du, u, p, t)
    mul!(du, M, u)
    lmul!(sin(p * t), du)
end

where we want to integrate across the intervall [0, 1000].

The version without the additional operations looks like the following

@time begin
u0 = vec(rand(Float64, (10, 10)))
t_start = 0
t_end = 10000
reset_timer!()
@timeit "Problem definition" prob = ODEProblem(f, u0, (t_start , t_end), 0.7)
@timeit "Solve" sol = solve(prob, save_everystep=false, save_start=false)
print_timer()
end
───────────────────────────────────────────────────────────────────────────────
                                       Time                    Allocations      
                              ───────────────────────   ────────────────────────
       Tot / % measured:           14.9s / 100.0%           7.11MiB / 100.0%    

 Section              ncalls     time    %tot     avg     alloc    %tot      avg
 ───────────────────────────────────────────────────────────────────────────────
 Solve                     1    14.9s  100.0%   14.9s   7.10MiB   99.9%  7.10MiB
 Problem definition        1   95.3μs    0.0%  95.3μs   4.44KiB    0.1%  4.44KiB
 ───────────────────────────────────────────────────────────────────────────────
 14.904764 seconds (454.00 k allocations: 7.141 MiB)

while the model with additnal operations solves for times between two subsequent opterations t_n \rightarrow t_{n+1} piecewise.

@time begin
reset_timer!()

u0 = vec(rand(Float64, (10, 10)))
storage = Array{Float64}(undef, size(u0))
operation = kron(I(10), 0.1 * diagm([1, -1, -1, 1, 1, -1, -1, 1, 1, -1]))
for i=0:9999
    t_start = i
    t_end = i + 1
    @timeit "Problem definition" prob = ODEProblem(f, u0, (t_start , t_end), 0.7)
    @timeit "Solve" sol = solve(prob, save_everystep=false, save_start=false)
    u0 = sol.u[end]
    u0 = operation * u0
end

print_timer()
end
 ───────────────────────────────────────────────────────────────────────────────
                                       Time                    Allocations      
                              ───────────────────────   ────────────────────────
       Tot / % measured:           23.6s /  97.1%           1.90GiB /  99.4%    

 Section              ncalls     time    %tot     avg     alloc    %tot      avg
 ───────────────────────────────────────────────────────────────────────────────
 Solve                 10.0k    22.5s   98.4%  2.25ms   1.85GiB   97.9%   194KiB
 Problem definition    10.0k    370ms    1.6%  37.0μs   41.2MiB    2.1%  4.22KiB
 ───────────────────────────────────────────────────────────────────────────────
 23.587671 seconds (2.91 M allocations: 1.900 GiB, 0.59% gc time)

It seems like the allocations scale with the number of partitions of the interval and the runtime also increases a lot. As I have quite a lot of these external operations, this creates some trouble for me.

In the latter case, I had to redefine the ODEProblem and the solve-operation for every iteration and I feel there must be a more clever method to do this, as all integrations are fairly similar (only the time interval changes here).

So my question: Is there a better way to solve this?

The build-in solution for that are callbacks, see for example Timed Callbacks · DiffEqCallbacks.jl or Event Handling and Callback Functions · DifferentialEquations.jl