I have a large-scale continuation problem for which I'm considering using matrix-free methods in BifurcationKit.jl. To evaluate stability and detect bifurcations, I implemented a matrix-free shift-invert eigensolver in the same fashion as the Swift-Hohenberg 3D tutorial, but I notice that it is considerably slower than a matrix-based one. For instance, if I run the following lines of code from the tutorial,
using Arpack
@time for i = 1:5 eigSH3d(dx -> L1 * dx, 3) end
@time for i = 1:5 Arpack.eigs(L1; nev = 3, sigma = 0.1) end
the second option is nearly twice as fast. Furthermore, because the eigenvalue computation appears to be the slowest step in computing the bifurcation diagram, matrix-based methods seem to make the whole continuation faster than matrix-free ones. I tested this by running the following lines of code.
# Jacobian
function J_sh(u, p)
    @unpack l, ν, L1 = p
    return -L1 .+ spdiagm(0 => l .+ 2 .* ν .* u .- 3 .* u.^2)
end

prob = BK.BifurcationProblem(F_sh, AF(vec(sol0)), par, (@lens _.l),
    J = J_sh,
    recordFromSolution = (x, p) -> (n2 = norm(x), n8 = norm(x, 8)),
    issymmetric = true)
# Newton options with the matrix-based shift-invert eigensolver
optnew = NewtonPar(verbose = true, tol = 1e-8, maxIter = 20)
@set! optnew.eigsolver = EigArpack(0.1, :LM, tol = 1e-12)
sol_hexa = @time newton(prob, optnew)
println("--> norm(sol) = ", norm(sol_hexa.u, Inf64))
# Continuation parameters
optcont = ContinuationPar(dsmin = 0.0001, dsmax = 0.005, ds = -0.001,
    pMax = 0.15, pMin = -0.1,
    newtonOptions = setproperties(optnew; tol = 1e-9, maxIter = 15),
    maxSteps = 146, detectBifurcation = 3, nev = 15, nInversion = 4,
    plotEveryStep = 1)
# Continuation problem
br = @time continuation(
    reMake(prob, u0 = 0prob.u0),
    PALC(tangent = Bordered(), bls = BorderingBLS(solver = optnew.linsolver, checkPrecision = false)),
    optcont;
    plot = false, verbosity = 1,
    normC = norminf,
    event = BK.FoldDetectEvent,
    # finaliseSolution = (z, tau, step, contResult; k...) -> begin
    #     if length(contResult) == 1
    #         pretty_table(contResult.branch)
    #     else
    #         pretty_table(contResult.branch, overwrite = true)
    #     end
    #     true
    # end
)
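For context, my matrix-free eigensolver follows the tutorial's pattern: shift the Jacobian, wrap the inverse in an iterative solve, and hand the resulting operator to KrylovKit. A rough sketch, not my exact code; here ls stands for the tutorial's preconditioned GMRES solver and n for the problem size:
using KrylovKit

# Matrix-free shift-invert: the eigenvalues of (J - σI)⁻¹ are 1/(λ - σ), so the
# interior eigenvalues of J closest to σ become extremal and easy for Arnoldi.
function eigMF(J, nev, n; σ = 0.1, ls = ls)
    Jshift = du -> J(du) .- σ .* du    # shifted operator, applied matrix-free
    A = du -> ls(Jshift, du)[1]        # "inner loop": one full iterative solve per application
    vals, vecs, info = KrylovKit.eigsolve(A, rand(n), nev, :LM; tol = 1e-12, issymmetric = true)
    vals2 = 1 ./ vals .+ σ             # undo the spectral transformation
    ind = sortperm(vals2, by = real, rev = true)
    return vals2[ind], vecs[ind], true, info.numops
end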
As far as I can tell, the main source of slowness is the so-called "inner loop" required to shift and invert the Jacobian. My understanding is that Arpack.jl factorizes the shifted Jacobian once and then solves each system directly with \ during the Arnoldi iterations, whereas the matrix-free version must run a full preconditioned iterative solve at every Arnoldi iteration, so its cost grows much faster with the number of iterations. Here is some code that demonstrates this:
# Matrix-based procedure of Arpack.eigs: compute an appropriate factorization
# once and pass \ to the inner Arnoldi iteration
t1 = time()
P = factorize(L1)
u0 = vec(sol0)
for i = 1:20 P \ u0 end
t2 = time()
dt_M = t2 - t1
# Matrix-free procedure: run a preconditioned iterative linear solver
# at every inner Arnoldi iteration
t3 = time()
for i = 1:20 ls(dx -> L1 * dx, u0) end
t4 = time()
dt_MF = t4 - t3

println("Matrix-free: " * string(dt_MF) * " s")
println("Matrix-based: " * string(dt_M) * " s")
Matrix-free: 1.9449999332427979 s
Matrix-based: 1.0419998168945312 s
Thus, I imagine the way to optimize this is either to speed up the "inner" solves (i.e. fine-tune the iterative linear solver somehow) or to find an alternative to the shift-invert method for computing interior eigenvalues (i.e. one that bypasses the need for an inner loop).
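To make that concrete, here is the kind of thing I have in mind (a sketch only: the tolerances are guesses, Pl stands for a preconditioner, and eigMF is the sketch above):
# Idea 1: loosen the inner solve. The Arnoldi iteration may tolerate inexact
# applications of (J - σI)⁻¹, so the inner GMRES need not hit machine precision.
ls_loose = GMRESKrylovKit(verbose = 0, rtol = 1e-6, maxiter = 100, ishermitian = true, Pl = Pl)
vals, vecs = eigMF(dx -> L1 * dx, 3, length(u0); ls = ls_loose)

# Idea 2: bypass shift-invert. For stability only the rightmost eigenvalues
# matter, so one could ask for :LR directly; no inner loop, but convergence
# without a spectral transformation may be slow.
vals2, vecs2, info = KrylovKit.eigsolve(dx -> L1 * dx, rand(length(u0)), 3, :LR; tol = 1e-10)
I am not sure either of these actually wins in practice, though.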
Has anyone else dabbled with this sort of thing? Your help would be greatly appreciated.