Well I think I could, it’s just a bit more complicated than I anticipated because my out-of-place Jacobian function drops zeros (the non-structural entries that are sometimes zero). This is both good (because there are quite a few) and bad (because some branching inside the nonlinear function means some non-structural zeros appear and disappear and make it hard to use for an inplace version).
For the record and because it helps me think about, I’ll desribe my Jacobian function here and I might eventually copy this into the package docs anyway. Also maybe someone has good ideas to improve it and make it inplace, too, so here goes. This goes in the details so it is a little wordy — so apologies in advance.
TLDR: I have a function F that looks like F(x) = G(x) - T x where G is a perfect candidate for ForwardDiff and SparseDiffTools, while T is already available as the Jacobian of x \mapsto T x. How can I make an inplace Jacobian for J(x) = \nabla G(x) - T?
Long version:
When trying to solve for F(x) = 0, the variable x is actually a concatenation of smaller (but still large) vectors:
where each x_i can be a ~200,000 sized column vector.
All the x_i have the same length and share the same indices, in the sense that these indices correspond to some linear indices of a 3D grid. Each different x_i thus actually represents a different 3D field that has been “vectorized”. The function F can be similarly split
and each F_i is further split into a “nonloncal” linear part and a “local” nonlinear part, like so
where the T_i are sparse matrices but not diagonal, and their sparsity structure does not correspond to any of the sparse array packages I have seen (they have 1 to ~20 off-diagonals that can be quite far off the main diagonal and whose distance to the diagonal can vary with each row/column). The G_i, on the other hand, are nonlinear, but local, in that the j-th entry [G_i(x_1, \ldots , x_N)]_j only depends on the j-th entries of its x_i arguments. The AIBECS.jl user supplies the T_i and the G_i functions, which I then use to build F and its Jacobian as a function of a single larger vector x.
The jacobian of F can be built efficiently using autodifferentiation of the G_i only (ForwardDiff.jl in my case). If I denote by \partial_j G_i the “derivative” of G_i with respect to x_j to be the (square) jacobian of the x_j \mapsto G_i(x_1,\ldots,x_N) function, which is diagonal because the G_i are “local” as explained just above, then jacobian of F can be written as
In practice, I actually build it with
where N ForwardDiff’d calls to the G_i's are sufficient to build the first term (which is essentially an array of diagonals) with some combination of sparse
, Diagonal
, hcat
, and vcat
operations, and where the second term uses the blockdiag
function on the T_i. I believe the -
operation drops all the non-structural zeros (but I need to double check), which is good and bad, as mentioned in the intro to this post.
I could use SparseDiffTools for the first term by supplying the sparsity pattern, but it is not efficient to use it on the whole F because the T_i fill in a lot of the sparsity pattern and I actually don’t need to use ForwardDiff on the T_i x_i terms since these are linear and the T_i are already provided.
So how should I go about making an inplace version of this Jacobian?