I would like to use sparsity to speed up an optimization problem.
I have an N×M matrix of data Z and a model m(A) = A - k*sum(A, dims=2)*sum(A, dims=1), where A is a matrix of the same dimensions as Z, sum(A, dims=2) is an N×1 vector, and sum(A, dims=1) is a 1×M vector, so that k*sum(A, dims=2)*sum(A, dims=1) is also an N×M matrix.
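For concreteness, here is the model in code (the N, M, and k values are just for illustration):

```julia
N, M, k = 3, 4, 0.1
Z = rand(N, M)
m(A) = A - k * sum(A, dims=2) * sum(A, dims=1)  # (N×1)*(1×M) outer product gives N×M
size(m(rand(N, M))) == size(Z)                  # true
```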
I want to compute the Jacobian J of the model m(A) with respect to A (I’m thinking automatic differentiation). The Jacobian J will be an (N·M) × (N·M) sparse matrix.
I will then need to solve a linear system Jv = w, where both v and w are (N·M)×1 vectors.
(This is for a Gauss-Newton update of A, starting from some initial condition and iterating until the model m(A) matches the data Z. There are some positivity constraints on A that make the problem interesting, but they’re not relevant at this stage.)
I would like to use the fact that J is sparse to speed up my computations. What’s the best approach in Julia? Can you give me some pointers and maybe some basic examples?
First of all, in this simple case you may not need to use autodiff at all, since the Jacobian looks quite easy to compute. You can even express it as a (functional) linear operator instead of a matrix, and then use packages such as IterativeSolvers.jl, LinearSolve.jl, Krylov.jl or KrylovKit.jl to solve your linear system.
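For example, the Jacobian-vector product of your model can be written out by hand and wrapped as a matrix-free operator. A rough sketch using LinearMaps.jl and IterativeSolvers.jl (untested; this package combination is just one option):

```julia
using LinearMaps, IterativeSolvers

N, M, k = 8, 6, 0.1
A = rand(N, M)  # linearization point

# Directional derivative of m(A) = A - k*sum(A,dims=2)*sum(A,dims=1) in direction V:
#   dm = V - k*(sum(V,dims=2)*sum(A,dims=1) + sum(A,dims=2)*sum(V,dims=1))
function jvp(v)
    V = reshape(v, N, M)
    dm = V .- k .* (sum(V, dims=2) * sum(A, dims=1) .+ sum(A, dims=2) * sum(V, dims=1))
    return vec(dm)
end

J = LinearMap(jvp, N * M)  # matrix-free linear operator, never materialized
w = rand(N * M)
v = gmres(J, w)            # solve J v = w iteratively
```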
I don’t think Zygote and the like natively output sparse Jacobians, but I just discovered there are tools out there to help you do just that: maybe SparseDiffTools.jl or SparsityDetection.jl can be useful?
@Vaibhavdixit02 just added a bunch of sparsity support to Optimization.jl. @Vaibhavdixit02, do you think you can put up a tutorial demonstrating it for the OP here? That’s probably a good starting point. There might be some back and forth required to get to a solution, but at least the tools exist; they just need docs right now.
Thank you. I’ll get back to you soon, hopefully. I was aware of SparseDiffTools.jl and quickly played with it; I believe the automatic differentiation example with forwarddiff_color_jacobian was failing in my case (might be a trivial issue, not sure). I definitely want to use automatic differentiation, since the goal is to have a tool for quick exploration and characterization of models that can be more complex than the example I showed. In general I prefer to have full control of the optimization algorithm, but I’m happy to try Optimization.jl if there are some examples (is there a link?).
If A and m(A) are sparse, then each of them is represented internally by a (dense) vector of non-zero values, plus a simple data structure describing the index corresponding to each non-zero value (the sparsity structure; cf. findnz).
For a given A, compute m(A): you now have the sparsity structures of both A and m(A).
Create a function which, given a vector of values of length nnz(A), builds A, computes m(A), and extracts its vector of non-zero values.
This function transforms a dense vector into a dense vector. Using automatic differentiation, compute its Jacobian, a matrix of size (nnz(m(A)), nnz(A)).
Using the sparsity structures of A and m(A), you can then compute the sparsity structure of your 4th-order Jacobian array.
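Something like this (an untested sketch; the model here is a hypothetical sparsity-preserving one, just for illustration):

```julia
using SparseArrays, ForwardDiff

msp(A) = 2A - A .* A          # hypothetical model that keeps the sparsity of A

A0 = sprand(8, 6, 0.3)
M0 = msp(A0)                  # fixes the sparsity structures of A and m(A)
iA, jA, _ = findnz(A0)
iM, jM, _ = findnz(M0)

# Dense-vector-to-dense-vector function: non-zeros of A -> values of m(A) on M0's pattern
function mvals(vals)
    A = sparse(iA, jA, vals, size(A0)...)
    Mi = msp(A)
    return [Mi[iM[r], jM[r]] for r in eachindex(iM)]
end

Jsmall = ForwardDiff.jacobian(mvals, nonzeros(A0))  # size (nnz(m(A)), nnz(A))
```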
I have a small Pluto notebook with an MWE, where forwarddiff_color_jacobian!(autoJac, s, d, colorvec = colors) generates a BoundsError.
I can’t seem to attach it here in any form (I can’t attach the HTML, and I can’t add it as preformatted text because it’s too long: more than 32000 lines due to the included package dependency tree). Any suggestions on how to share the notebook for people to look at?
```julia
const N1 = 8
const N2 = 6

scall = 0  # counts how many times s gets evaluated

# In-place model: m[i,j] = a[i,j] - 0.1 * (i-th row sum of a) * (j-th column sum of a)
function s(m, a)
    global scall += 1
    for i in axes(a, 1), j in axes(a, 2)
        m[i, j] = a[i, j] - 0.1 * sum(a[i, :]) * sum(a[:, j])
    end
    return m
end

reset_scall() = (global scall = 0)
```
I guess one could manually vectorize and reshape the input of the function s inside the function definition? But the point was to avoid it and keep working with a matrix.
OK. Sparsity detection from Symbolics is faster than the one from the deprecated SparsityDetection for the original nested-loops s() definition. With Symbolics the error goes away if I use
```julia
m = a - 0.1 * sum(a, dims=2) * sum(a, dims=1)
```
instead of the nested loops, but then it just returns an empty (N1·N2)×(N1·N2) sparse matrix.
In any case, nested loops are fine if needed; my main interest is still to use the sparsity pattern to speed up the autodiff and then use the sparse autodiff Jacobian in a linear system to generate a Gauss-Newton step, as described in the original question. A small snippet extracting the Jacobian J and using it in a v = J \ w solve (that takes advantage of sparsity) would be super helpful as a reference.
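To be concrete, I imagine something along these lines (an untested sketch on my side; svec! is just a flattening wrapper around the s above):

```julia
using SparseArrays, SparseDiffTools, Symbolics

# Flatten the in-place s(m, a) above to vector-in / vector-out form
function svec!(mv, av)
    s(reshape(mv, N1, N2), reshape(av, N1, N2))
    return mv
end

av = rand(N1 * N2)
mv = similar(av)

# Detect the sparsity pattern symbolically, then color it
pattern = Symbolics.jacobian_sparsity(svec!, mv, av)
J = Float64.(sparse(pattern))
colors = matrix_colors(J)

# Fill J with a compressed forward-mode sweep (one dual evaluation per color)
forwarddiff_color_jacobian!(J, svec!, av, colorvec = colors)

# Gauss-Newton-style step: solve J v = w via a sparse factorization
w = rand(N1 * N2)
v = J \ w
```

In an actual Gauss-Newton loop the pattern and colors could presumably be computed once and reused across iterations, since the sparsity structure does not change.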
Thank you for the suggestion @mohamed82008. It looks like the Zygote Jacobian extraction is orders of magnitude slower than non-sparse ForwardDiff, and I’m not sure that difference can be recovered by a sparse linear solve. I’m thinking about 60×60 matrices, and so 60×60×60×60 Jacobians. That’s why I was curious about exploiting the sparsity in the first place.
PS: I’m assuming this is intrinsic; I don’t see any caching-related improvements if I rerun the Zygote Jacobian extraction multiple times, unless I’m missing something obvious.
How sparse is your Jacobian matrix? It’s possible my implementation + Zygote are adding some overhead. If the same slowdown persists for larger matrix sizes, then it’s an intrinsic slowdown, because I don’t think the overhead scales with the matrix size. If the slowdown disappears for larger matrices, then we can’t conclude whether or not it’s intrinsic for smaller matrices.