Hi,
I am trying to use ForwardDiff to compute the Jacobian of a system of equations that I anticipate will be very large. Below is a very simplified working example that illustrates what I am trying to do.
When I run the code below, I get the following output from @time:

20.089338 seconds (128.02 k allocations: 46.572 GiB, 4.09% gc time)
Since I am new to Julia, it is likely that I am doing something wrong. Could someone help with the following questions?
- Would you expect ForwardDiff.jacobian to take 20 s for the case of 50,000 equations?
- Could you let me know if there is anything I could do to optimize the code below?
Thanks in advance for the help.
using ForwardDiff
using ForwardDiff: Chunk
function ResidualEquations(X::AbstractArray)
    1 * X
end

p0 = rand(50000)
f(x) = ResidualEquations(x)

println("\n \t ... Creating Config ... ")
jconfig = ForwardDiff.JacobianConfig(f, p0, Chunk{2}())

println("\n \t ... Evaluating jacobian ... ")
@time jac = ForwardDiff.jacobian(f, p0, jconfig)::Array{Float64,2}
Thanks for the reply. I think my Jacobian is sparse, but it is not clear from your answer how using SparseDiffTools could help. Could you please provide an example? Thanks in advance!
Use SparsityDetection.jl (https://github.com/SciML/SparsityDetection.jl) to get the sparsity pattern, then use SparseDiffTools.jl (https://github.com/JuliaDiffEq/SparseDiffTools.jl) for matrix coloring to get a color vector, and use SparseDiffTools.jl again for the forward color AD. See the SparseDiffTools.jl README for an example.
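Roughly, that workflow can be sketched as follows. This is a sketch following the SparseDiffTools README, not a drop-in solution: `residual!` here is a hypothetical in-place stand-in for your real system (its Jacobian is simply the identity, so the detected pattern is diagonal and a single colored pass suffices).

```julia
using SparsityDetection, SparseArrays, SparseDiffTools

# Hypothetical in-place residual; substitute your real (sparse) system.
function residual!(out, x)
    out .= x
end

x = rand(1000)
out = similar(x)

# 1. Detect the sparsity pattern automatically.
pattern = jacobian_sparsity(residual!, out, x)
jac = Float64.(sparse(pattern))

# 2. Color the columns so that structurally independent columns
#    can share one dual-number pass.
colors = matrix_colors(jac)

# 3. Colored forward-mode AD: the number of passes scales with
#    maximum(colors) instead of length(x).
forwarddiff_color_jacobian!(jac, residual!, x, colorvec = colors, sparsity = jac)
```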
I’m not sure I can help but I’ll chime in.
You should definitely figure out whether it is sparse. Are most entries \frac{\partial f_i(x)}{\partial x_j} of the Jacobian equal to zero? If yes, then you should think about better strategies than just calculating all the entries, which I think is what ForwardDiff does in your example (not a smart strategy here, because in your example the Jacobian is simply J(x) = I).
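You can check this directly on a small input; the following sketch assumes the f(x) = 1 * x function from the original post:

```julia
using ForwardDiff
using LinearAlgebra

# The Jacobian of f(x) = 1 * x is the identity matrix, so every
# off-diagonal entry that ForwardDiff computes is a known zero.
f(x) = 1 * x
J = ForwardDiff.jacobian(f, rand(4))
J == I   # true
```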
If you know it is sparse, I would also suggest decomposing your function into simpler pieces if possible, and thinking about what the Jacobian of each piece is. For example:
- if f(x) = M * x, then the Jacobian is J(x) = M, so you should already have it,
- if f is nonlinear but local, i.e., f(x) = g.(x) for some nonlinear scalar function g, then J(x) = Diagonal(Dg.(x)) where Dg(x) = ForwardDiff.derivative(g, x), which should cost you the equivalent of a single call to f,
- if f is something like f(x) = M * g.(x), then J(x) = M * Diagonal(Dg.(x)) should also cost you the equivalent of one call to f,
- and so on…
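The "nonlinear but local" case above can be verified directly; here is a small sketch where g = sin is just an illustrative choice:

```julia
using ForwardDiff
using LinearAlgebra

g(x) = sin(x)
Dg(x) = ForwardDiff.derivative(g, x)   # scalar derivative, here cos(x)

x = rand(5)
f(x) = g.(x)

J_dense = ForwardDiff.jacobian(f, x)   # 5x5 dense Jacobian, mostly zeros
J_diag  = Diagonal(Dg.(x))             # same information from one broadcast pass

J_dense ≈ J_diag   # true
```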
In any case, if the Jacobian of f is sparse, then you should follow @matthieu's advice and use SparseDiffTools to at least reduce the number of calls to f. I believe there is enough info in the README to get you started there.