Sparse jacobians of matrix models

gianmariomanca · June 10, 2022, 2:36am

Hi all,

I would like to use sparsity to speed up an optimization problem.

I have a NxM matrix of data Z and a model m(A) = A - k*sum(A, dims=2)*sum(A,dims=1) where A is a matrix of the same dimensions of Z, sum(A, dims=2) an Nx1 vector and sum(A,dims=1) an 1xM vector so that k*sum(A, dims=2)sum(A,dims=1) is also an NxM matrix.

I want to compute the jacobian J of the model m(A) with respect to A (I’m thinking automatic differentiation). The jacobian J will be a (NxM) x (NxM) sparse metric.

I will need then to solve a linear system Jv=w, where both v and w are (NxM)x1 vectors.
(Basically for a Gauss-Newton update of A starting from some initial condition for A, until the model m(A) matches the data Z. There are some positivity constraints on A to make the problem interesting, but that’s not relevant at this stage.)

I would like to use the fact that J is sparse to speed up my computations. What’s the best approach in Julia, can you give me some pointers and maybe provide some basic examples?

gdalle · June 10, 2022, 4:35am

Hi!

First of all, in this simple case you may not need to use autodiff at all, since the Jacobian looks quite easy to compute. You can even express it as a (functional) linear operator instead of a matrix, and then use packages such as IterativeSolvers.jl, LinearSolve.jl, Krylov.jl or KrylovKit.jl to solve your linear system.

I don’t think Zygote and the like natively output sparse Jacobians, but I just discovered there are tools out there to help you do just that: maybe SparseDiffTools.jl or SparsityDetection.jl can be useful?

ChrisRackauckas · June 10, 2022, 4:50am

@Vaibhavdixit02 just added a bunch of sparsity support to Optimization.jl. @Vaibhavdixit02 , do you think you can get a tutorial up to demonstrate it for the OP here? That’s probably a good starting point, and there might be some back and forth required from that to get the solution, but at least the tools exist but need docs right now.

gianmariomanca · June 13, 2022, 6:31pm

Thank you. I’ll get back to you soon hopefully. I was aware of SparseDiffTools.jl and quickly played with it. I believe the automatic differentiation example with forwarddiff_color_jacobian was failing in my case (might be a trivial issue, not sure). I definitely want to use automatic differentiation, since the goal is to have a tool for quick model exploration and characterization that can be more complex than the example I showed. In general I prefer to have full control of the optimization algorithm, but I’m happy to try Optimization.jl if there are some examples (is there a link?).

ChrisRackauckas · June 14, 2022, 3:08am

What do you mean by “failing in my case”? MWE?

Philippe_Maincon1 · June 14, 2022, 8:13am

If A and m(A) are sparse, then each of them is represented internally by a (dense) vector of non-zero values, and a simple data structure describing the index corresponding to each non-zero value (the sparsity structure) (cf findnz).

For a given A , compute m(A) - you now have the sparsity structures of A and m(A).

Create a function which, given a vector of values of length nnz(A), builds A, computes m(A) and extract its vector of non-zero values.

This function transforms a dense vector into a dense vector. Using automatic differentiation, compute its Jacobian, a matrix of size (nnz(A),nnz(m(A))) (or was it transposed…).

Using the sparsity structures of A and m(A) you can compute the sparsity structure of your 4th order Jacobian array.

I think…?

gianmariomanca · June 24, 2022, 5:04pm

Hi @ChrisRackauckas , I apologize for the delay.

I have a small Pluto notebook with a MWE, where the forwarddiff_color_jacobian!(autoJac, s, d, colorvec = colors) will generate a BoundsError.

I can’t seem to be able to attach it here in any form (I can’t attach the html, I can’t add it as preformatted text as it’s too long, more than 32000 lines due to the included package dependency tree). Any suggestions on how to share the notebook for people to look at it?

gianmariomanca · June 24, 2022, 8:00pm

I guess the only relevant piece of code is:

	const N1 = 8
	const N2 = 6
	scall = 0
	function s(m,a)
		global scall += 1
		for i = 1:size(a)[1]
			for j = 1:size(a)[2]
				m[i,j] = a[i,j] - 0.1*sum(a[i,:])*sum(a[:,j])
			end
		end
	end
	function reset_scall()
		global scall = 0
    end

and the question basically boils down to creating a working example for forwarddiff_color_jacobian! along the lines of the example in GitHub - JuliaDiff/SparseDiffTools.jl: Fast jacobian computation through sparsity exploitation and matrix coloring for a 2D matrix input.

ChrisRackauckas · June 24, 2022, 9:21pm

Just vec the input?

gianmariomanca · June 24, 2022, 9:31pm

If I vec(d) the input then the definition of the function s doesn’t make sense and you get another error.

I guess one could manually vectorize and reshape the input of the function s inside the function definition? But the point was to avoid it and keep working with a matrix.

gianmariomanca · June 24, 2022, 9:45pm

Also that didn’t work:

going from

function s(m,a)

to

function s(m,b)
a = reshape(b,N1,N2)

and passing a vectorized input still fails.

ChrisRackauckas · June 24, 2022, 10:10pm

Can you try to build an example?

gianmariomanca · June 24, 2022, 10:33pm

Here it is, you should be able to open it as Pluto notebook or just as a Julia script:

### A Pluto.jl notebook ###
# v0.19.9

using Markdown
using InteractiveUtils

# ╔═╡ 690f6163-7a99-4b3e-bb70-47775b14dfb4
using SparsityDetection, SparseArrays

# ╔═╡ e8b4b6c5-1871-4675-b66a-97008f1b23c8
using SparseDiffTools

# ╔═╡ 55afd5f5-504c-4574-bd03-58fa9a5e09cc
using Plots

# ╔═╡ bfb90258-de91-4955-adcf-7c93f79fdefe
using FiniteDiff

# ╔═╡ 526d04e1-e3d0-4b06-b3b2-d55e2fc73bec
using IterativeSolvers

# ╔═╡ 5feaede9-0bd3-4e66-9063-ce828eaf6f7e
begin
	const N1 = 8
	const N2 = 6
end

# ╔═╡ 59a70fac-01e1-429f-81bf-06ab74b1c3a7
begin
	scall = 0
	function s(m,b)
		a = reshape(b,N1,N2)
		global scall += 1
		for i = 1:size(a)[1]
			for j = 1:size(a)[2]
				m[i,j] = a[i,j] - 0.1*sum(a[i,:])*sum(a[:,j])
			end
		end
	end
	function reset_scall()
		global scall = 0
    end
end

# ╔═╡ 38b541da-e906-4af4-8736-42e3c65e79cf
d = rand(N1,N2)

# ╔═╡ cd74031b-ebd1-4e18-89d4-e8f68f101a89
k = reshape(d,N1,N2)

# ╔═╡ 4b4f9102-ea56-4c92-9655-34771e32a591
o = similar(d);

# ╔═╡ 8494a376-222f-4ca7-abe4-233ec037b3ba
sparsity_pattern = jacobian_sparsity(s,o,d)

# ╔═╡ e47c9391-0e27-4c54-8c3a-2d58f2935759
jac = Float64.(sparse(sparsity_pattern))

# ╔═╡ 0234b949-3520-4853-9018-5f3b96c803fd
colors = matrix_colors(jac)

# ╔═╡ c5aa54b7-e9bd-45c6-9673-44e9bb215f9a
begin
	reset_scall()
	FiniteDiff.finite_difference_jacobian!(jac, s, d, colorvec=colors)
	@show(scall)
end

# ╔═╡ e4116e71-d638-424a-a319-0b8cd084bade
jac\vec(d)

# ╔═╡ ab015e88-631d-4cfe-bf09-400bedb25f20
j = vec(jac)

# ╔═╡ d4ab28fb-9124-420d-8f99-3b099c73f6ed
scatter(j)

# ╔═╡ d80b1f58-8dd0-4a87-a6a2-3038e4f3da2a
heatmap(Matrix(jac))

# ╔═╡ 867a6b3a-9a6d-4c94-94d9-5a63a642b028
autoJac = zeros(N1*N2,N1*N2)

# ╔═╡ f20c670a-9f86-4def-97e1-79e8a2f63bf6
vec(d)

# ╔═╡ 3be2e06f-e005-487d-921e-d2d56702599a
dv = deepcopy(vec(d))

# ╔═╡ 6741b47d-4513-4dcb-aec8-186c03c24ccc
forwarddiff_color_jacobian!(autoJac, s, dv, colorvec = colors)

# ╔═╡ 00000000-0000-0000-0000-000000000001
PLUTO_PROJECT_TOML_CONTENTS = """
[deps]
FiniteDiff = "6a86dc24-6348-571c-b903-95158fe2bd41"
IterativeSolvers = "42fd0dbc-a981-5370-80f2-aaf504508153"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
SparseDiffTools = "47a9eef4-7e08-11e9-0b38-333d64bd3804"
SparsityDetection = "684fba80-ace3-11e9-3d08-3bc7ed6f96df"

[compat]
FiniteDiff = "~2.12.1"
IterativeSolvers = "~0.9.2"
Plots = "~1.29.1"
SparseDiffTools = "~1.23.0"
SparsityDetection = "~0.3.4"
"""

# ╔═╡ Cell order:
# ╠═690f6163-7a99-4b3e-bb70-47775b14dfb4
# ╠═e8b4b6c5-1871-4675-b66a-97008f1b23c8
# ╠═55afd5f5-504c-4574-bd03-58fa9a5e09cc
# ╠═5feaede9-0bd3-4e66-9063-ce828eaf6f7e
# ╠═59a70fac-01e1-429f-81bf-06ab74b1c3a7
# ╠═38b541da-e906-4af4-8736-42e3c65e79cf
# ╠═cd74031b-ebd1-4e18-89d4-e8f68f101a89
# ╠═4b4f9102-ea56-4c92-9655-34771e32a591
# ╠═8494a376-222f-4ca7-abe4-233ec037b3ba
# ╠═e47c9391-0e27-4c54-8c3a-2d58f2935759
# ╠═0234b949-3520-4853-9018-5f3b96c803fd
# ╠═bfb90258-de91-4955-adcf-7c93f79fdefe
# ╠═c5aa54b7-e9bd-45c6-9673-44e9bb215f9a
# ╠═526d04e1-e3d0-4b06-b3b2-d55e2fc73bec
# ╠═e4116e71-d638-424a-a319-0b8cd084bade
# ╠═ab015e88-631d-4cfe-bf09-400bedb25f20
# ╠═d4ab28fb-9124-420d-8f99-3b099c73f6ed
# ╠═d80b1f58-8dd0-4a87-a6a2-3038e4f3da2a
# ╠═867a6b3a-9a6d-4c94-94d9-5a63a642b028
# ╠═f20c670a-9f86-4def-97e1-79e8a2f63bf6
# ╠═3be2e06f-e005-487d-921e-d2d56702599a
# ╠═6741b47d-4513-4dcb-aec8-186c03c24ccc
# ╟─00000000-0000-0000-0000-000000000001
# ╟─00000000-0000-0000-0000-000000000002

gianmariomanca · June 25, 2022, 3:21am

As a side note, in the definition of s(), if I replace the nested loops with the equivalent

m = a - 0.1*sum(a,dims=2) * sum(a,dims=1)

I get an earlier error:

ChrisRackauckas · June 25, 2022, 7:39am

The issue there it looks like you’re using the deprecated SparsityDetection instead of Symbolics, is that on purpose?

gianmariomanca · June 25, 2022, 2:42pm

Not on purpose. I literally just copy pasted the examples from the README.md in GitHub - JuliaDiff/SparseDiffTools.jl: Fast jacobian computation through sparsity exploitation and matrix coloring to get something quick going to see if I could speed up some experiments. This is why I was asking for guidance and examples and I tagged the question also as “documentation”.

gianmariomanca · June 25, 2022, 9:29pm

OK. Using sparsity detection from Symbolic is faster than the one from the deprecated SparsityDetection for the original nested loops s() definition. With Symbolics the error goes away if I use

m = a - 0.1*sum(a,dims=2) * sum(a,dims=1)

instead of the nested loops, but then it just returns an empty (N1N2)x(N1N2) sparse metric

maybe this is expected? Non sure.

In any case, nested loops are fine if needed, my main interest is still to use the sparsity pattern to speed up the auto-diff and use the sparse autodiff Jacobian in a linear system to generate a Gauss-Newton step as described in the original question. A small snippet extracting the Jacobian J and using it in a w = J\v operation (that takes advantage of sparsity) would super helpful as a reference.

mohamed82008 · June 26, 2022, 8:19am

Try the latest NonconvexUtils.jl

using NonconvexUtils, Zygote, ForwardDiff, LinearAlgebra

# Pre-process

m(A) = A - sum(A, dims=2) * sum(A, dims=1)
x = rand(5, 5)
sparse_m = sparsify(m, x)

# Calculate the Jacobian

x = rand(5, 5)
Zygote.pullback(sparse_m, x) # triggers the computation of the Jacobian
J = sparse_m.flat_f.jac

# Test

norm(ForwardDiff.jacobian(x -> vec(m(reshape(x, 5, 5))), vec(x)) - J)

gianmariomanca · June 27, 2022, 6:04pm

Thank you for the suggestion @mohamed82008 . It looks like the Zygote jacobian extraction is orders of magnitude slower than the non sparse ForwardDiff, I’m not sure if that difference can be recovered by a sparse linear solve. I’m thinking about 60x60 matrices, and so 60x60x60x60 jacobians. That’s why I was curious about

forwarddiff_color_jacobian!(autoJac, s, dv, colorvec = colors)

PS I’m assuming this is intrinsic, I don’t see any cashing related improvements if I rerun the Zygote jacobian extraction multiple times, unless I’m missing something obvious.

mohamed82008 · June 27, 2022, 9:06pm

How sparse is your Jacobian matrix? It’s possible my implementation + Zygote are adding some overhead. If the same slowdown persists for different larger matrix sizes, then it’s an intrinsic slowdown because I don’t think the overhead scales with the matrix size. If the slowdown disappears for larger matrices, then we can’t conclude if it’s intrinsic or not for smaller matrices.

Topic		Replies	Views
Module idea for helping calculate large sparse Jacobians Optimization (Mathematical)	5	520	August 25, 2019
How to deal with sparse Jacobians? Optimization (Mathematical)	13	3676	January 13, 2019
Correct use of the sparsity pattern in NLsolve.jl for Diagonal Jacobians Optimization (Mathematical) nlsolve	6	975	July 24, 2020
How to optimize ForwardDiff.jacobian for large system of equations? General Usage	4	1240	August 15, 2019
Sparsity Detection in Jacobian Performance question , diffeq , sparse , symbolics	1	502	February 4, 2022

Sparse jacobians of matrix models

Related topics