TLDR:

| | prep calls | exec calls |
|---|---|---|
| case 1 | 1 output shape + 1 sparsity detection | 1 pushforward |
| case 2 | 1 output shape + 1 sparsity detection | 1 pushforward |
| case 3 | 1 sparsity detection | 1 pushforward |
| case 4 | 1 sparsity detection | 1 pushforward + 2 pullbacks |
| case 4bis | 1 sparsity detection | 1 pushforward + 1 pullback |
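If you want to reproduce these tallies yourself, the simplest trick is a closure that increments a counter on every call (a pure-Julia sketch; `my_f` and `calls` are illustrative names, not part of DI):

```julia
# Hypothetical sketch: count how often a function is actually executed,
# e.g. during preparation vs. during a Jacobian computation.
calls = Ref(0)
my_f(x) = (calls[] += 1; abs2.(x))

x = rand(3)
my_f(x); my_f(x)
calls[]  # → 2: the counter tallies every execution of my_f
```

Passing such a wrapped function to `prepare_jacobian` and then to `jacobian` lets you attribute each call to the prep or exec phase.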
The rationale is that, for out-of-place functions, I perform one additional function call during preparation to generate a typical output `y` and get its type and shape (here). Strictly speaking, this is not necessary in forward mode, but it is in reverse mode. Right now the two modes share a lot of code, which is why I wrote it that way.
The whole mindset of DI is that the performance-critical work happens once preparation has been executed, and this is especially important in the sparse case. Thus, saving one function call during preparation has never been a priority (but if you have a use case that justifies it, we can talk about it).
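In other words, the intended usage pattern looks roughly like this (a sketch assuming DI's `prepare_jacobian`/`jacobian` API; `f`, `sparse_backend`, and `many_inputs` are placeholder names):

```julia
# Preparation pays for sparsity detection + coloring a single time:
prep = prepare_jacobian(f, sparse_backend, x)

# Execution then reuses the preparation and stays cheap:
for xi in many_inputs
    J = jacobian(f, prep, sparse_backend, xi)
end
```

Any extra function call during `prepare_jacobian` is amortized over all subsequent `jacobian` calls.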
This happens because mixed-mode sparse autodiff corresponds to a bidirectional coloring algorithm, in which some columns or rows get the color 0 (aka no color) when they are useless during recovery. This wasn't initially possible in SparseMatrixColorings.jl, so we deactivated it by default, hiding it behind `GreedyColoringAlgorithm(; postprocessing=true)`. I opened this PR to add a note to the DI docs.
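To see why a neutral color 0 makes sense, here is a hand-rolled toy example of bidirectional recovery (pure Julia, independent of the packages above): one pushforward covers every row with a single nonzero, one pullback covers the dense row, and the remaining rows need no reverse color at all.

```julia
# Toy 4×2 Jacobian with one dense row (row 1) and single-nonzero rows elsewhere.
J = [1.0 2.0;
     3.0 0.0;
     0.0 4.0;
     5.0 0.0]

# One pushforward with seed (1, 1): both columns share forward color 1.
v = J * ones(2)                 # v[i] = sum of row i

# One pullback with seed e₁: only row 1 needs a reverse color.
w = J' * [1.0, 0.0, 0.0, 0.0]   # w = row 1 of J

# Recovery: row 1 comes from the pullback; rows 2–4 come from the pushforward,
# since each has a single nonzero, so its row sum IS that entry.
R = zeros(4, 2)
R[1, :] = w
R[2, 1] = v[2]; R[3, 2] = v[3]; R[4, 1] = v[4]
R == J  # → true: rows 2–4 never needed a pullback, hence reverse color 0
```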
When toggling the `postprocessing` option (case 4bis), I get the right number of calls in case 4:
```julia
julia> sparse_mixed_backend = AutoSparse(
           MixedMode(AutoForwardDiff(), AutoReverseDiff());
           sparsity_detector=SparseConnectivityTracer.TracerSparsityDetector(),
           coloring_algorithm=SparseMatrixColorings.GreedyColoringAlgorithm(; postprocessing=true),
       );

julia> prepJ4 = prepare_jacobian(objcon, y4, sparse_mixed_backend, x);
In objcon

julia> column_colors(prepJ4)
2-element Vector{Int64}:
 1
 1

julia> row_colors(prepJ4)  # the 0's were 2's without postprocessing
4-element Vector{Int64}:
 1
 0
 0
 0

julia> J4 = jacobian(objcon, y4, prepJ4, sparse_mixed_backend, x)
In objcon
In objcon
```
Keep in mind, however, that the number of colors is not guaranteed to be optimal, and that in general a random vertex ordering may (or may not) help.
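For instance (a sketch, assuming SparseMatrixColorings' vertex-ordering API; check the package docs for the exact constructor signature), you can pass a different order to the greedy algorithm:

```julia
# Try a random vertex order instead of the default natural order.
algo = GreedyColoringAlgorithm(RandomOrder(); postprocessing=true)
```

Since the result is random, it is worth comparing the resulting number of colors against the default over a few trials.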
As a side note, you can use DifferentiationInterfaceTest.jl to perform this kind of benchmarking, with function call counts too! Just remember to deactivate printing first.
```julia
using DifferentiationInterfaceTest

function MyAutoSparse(backend)
    return AutoSparse(
        backend;
        sparsity_detector=TracerSparsityDetector(),
        coloring_algorithm=GreedyColoringAlgorithm(),
    )
end

backends = [
    MyAutoSparse(AutoForwardDiff(; chunksize=1)),
    MyAutoSparse(AutoReverseDiff()),
]

scens = [
    Scenario{:jacobian,:in}(sparse_obj_oop, x; res1=J1, name="outofplace"),
    Scenario{:jacobian,:in}(sparse_obj, zeros(3), x; res1=J2, name="inplace"),
]

data = benchmark_differentiation(backends, scens; benchmark=:full, logging=true)
```
The output is a nicely formatted `DataFrame` (see this tutorial for an example).