DifferentialEquations.jl : supplied Jacobian solution takes more time and memory

This iteration structure is slow. Julia arrays are column-major, so iterate along columns, not rows: the row index should be the innermost loop.
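To make the loop-order point concrete, here is a minimal sketch. The function names and the fill formula are made up for illustration and are not from your code. Both functions write the same values, but only the first walks memory contiguously.

```julia
# Column-major friendly: outer loop over columns, inner loop over rows.
function fill_colmajor!(J)
    @inbounds for j in axes(J, 2)      # columns on the outside
        for i in axes(J, 1)            # rows on the inside: contiguous in memory
            J[i, j] = i + 10j
        end
    end
    return J
end

# Same result, but the inner loop strides across columns (cache-unfriendly).
function fill_rowmajor!(J)
    @inbounds for i in axes(J, 1)
        for j in axes(J, 2)
            J[i, j] = i + 10j
        end
    end
    return J
end

A = fill_colmajor!(zeros(4, 3))
B = fill_rowmajor!(zeros(4, 3))
```

On a matrix this small the difference is invisible; it should only show up once the matrix no longer fits in cache, so benchmark at your real problem size.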

Note that sparse AD will absolutely murder your hand-written code in performance, though, since it would use a coloring vector with chunked ForwardDiff to SIMD multiple elements along the diagonal at the same time. I wouldn't even want to show you the equivalent code because it would be nasty to write out by hand, but you'd effectively clump chunks of 8 columns together and iterate down those columns, then have another loop on top that blocks it and SIMDs from that outer loop, into a densified matrix that matches the Tridiagonal structure.
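For a feel of what the coloring part buys you, here is a rough sketch of the idea only. It swaps in plain forward finite differences for ForwardDiff duals and does no chunking or SIMD, so it is not what the sparse AD tooling actually runs. The test function `f!`, the helper `colored_jacobian`, and the step `h` are all hypothetical. In a tridiagonal matrix, columns `j` and `j + 3` never share a row, so three perturbed evaluations recover every column.

```julia
using LinearAlgebra

# Hypothetical test problem: a 1-D stencil whose Jacobian is tridiagonal.
function f!(du, u)
    n = length(u)
    for i in 1:n
        left  = i > 1 ? u[i-1] : 0.0
        right = i < n ? u[i+1] : 0.0
        du[i] = left - 2u[i] + right + u[i]^2
    end
    return du
end

# Three-color compressed Jacobian via forward differences (assumed step h).
function colored_jacobian(f!, u; h = 1e-6)
    n = length(u)
    J = Tridiagonal(zeros(n - 1), zeros(n), zeros(n - 1))
    f0 = f!(similar(u), u)             # unperturbed evaluation
    f1 = similar(u)
    up = similar(u)
    for c in 1:3                       # one evaluation per color
        copyto!(up, u)
        for j in c:3:n                 # perturb every column of this color at once
            up[j] += h
        end
        f!(f1, up)
        for j in c:3:n                 # decompress: walk down each column
            for i in max(1, j - 1):min(n, j + 1)
                J[i, j] = (f1[i] - f0[i]) / h
            end
        end
    end
    return J
end

u = collect(range(0.1, 1.0; length = 7))
J = colored_jacobian(f!, u)
```

The number of function evaluations here is 3 + 1 regardless of `n`, instead of `n + 1` for the dense column-by-column approach. The AD version gets the same compression but with exact derivatives.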
