I’m really struggling with my ForwardDiff.jl bindings for DifferentiationInterface.jl. The goal is to achieve an allocation-free Jacobian-vector product (aka pushforward) for functions of the form `f!(y, x)`: given a seed `dx`, overwrite `dy` with `J(x) * dx`, where `J(x)` is the Jacobian. I have succeeded for `pushforward!` but not for `value_and_pushforward!` (which additionally overwrites `y` with the primal value).
To reproduce my issue:

- create a new environment containing Chairmarks.jl and JET.jl
- clone or `dev` DifferentiationInterface.jl into it from the main branch, making sure to use the subdirectory https://github.com/gdalle/DifferentiationInterface.jl/tree/main/DifferentiationInterface
- run the following snippet in VSCode (or remove `@profview_allocs` to run it elsewhere) and observe the results
**Profiling code**

```julia
using DifferentiationInterface
using Chairmarks, JET

f!(y, x) = copyto!(y, x)  # trivial in-place function: J(x) is the identity

function profile_pushforward(n)
    b = AutoForwardDiff()
    x, dx, y, dy = rand(n), rand(n), zeros(n), zeros(n)
    extras = prepare_pushforward(f!, y, b, x, dx)
    @test_opt pushforward!(f!, y, dy, b, x, dx, extras)  # JET: look for type instabilities
    @profview_allocs for _ in 1:100000  # VSCode allocation profiler
        pushforward!(f!, y, dy, b, x, dx, extras)
    end
    @be pushforward!(f!, y, dy, b, x, dx, extras)  # Chairmarks benchmark
end

function profile_value_and_pushforward(n)
    b = AutoForwardDiff()
    x, dx, y, dy = rand(n), rand(n), zeros(n), zeros(n)
    extras = prepare_pushforward(f!, y, b, x, dx)
    @test_opt value_and_pushforward!(f!, y, dy, b, x, dx, extras)
    @profview_allocs for _ in 1:100000
        value_and_pushforward!(f!, y, dy, b, x, dx, extras)
    end
    @be value_and_pushforward!(f!, y, dy, b, x, dx, extras)
end

profile_pushforward(10)
profile_pushforward(100)
profile_value_and_pushforward(10)
profile_value_and_pushforward(100)
```
**Results**

```julia
julia> profile_pushforward(10)
Benchmark: 5129 samples with 755 evaluations
min    22.925 ns
median 23.571 ns
mean   23.569 ns
max    65.959 ns

julia> profile_pushforward(100)
Benchmark: 3086 samples with 405 evaluations
min    69.160 ns
median 70.953 ns
mean   74.641 ns
max    291.783 ns

julia> profile_value_and_pushforward(10)
Benchmark: 2756 samples with 149 evaluations
min    192.268 ns (1 allocs: 32 bytes)
median 203.742 ns (1 allocs: 32 bytes)
mean   218.388 ns (1 allocs: 32 bytes)
max    642.477 ns (1 allocs: 32 bytes)

julia> profile_value_and_pushforward(100)
Benchmark: 2769 samples with 95 evaluations
min    296.947 ns (1 allocs: 32 bytes)
median 314.632 ns (1 allocs: 32 bytes)
mean   355.077 ns (1 allocs: 32 bytes, 0.04% gc time)
max    120.325 μs (1 allocs: 32 bytes, 99.15% gc time)
```
There is one allocation of 32 bytes in `value_and_pushforward!` regardless of the size of the input, so I’m thinking type instability. Unfortunately, `JET.@test_opt` says it’s all fine, so I’m rather confused.
What’s worse, the allocation disappears if you comment out either of the following lines: https://github.com/gdalle/DifferentiationInterface.jl/blob/cc738421b6a8ccd3f0b2a807159ccc78f0c1c151/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/twoarg.jl#L56-L57. So the type instability cannot be specific to either line taken alone, which seems weird to me.
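For context, here is a minimal standalone sketch of the dual-number pattern that a two-argument pushforward implements, written directly against ForwardDiff (the function name and the absence of caching are mine, this is not DI’s actual implementation):

```julia
using ForwardDiff: Dual, Tag, value, partials

# Hypothetical simplified pushforward for f!(y, x), for illustration only:
# seed the inputs with dual numbers, run f! once on the duals, then read the
# primal value and the tangent back out of the dual outputs.
function sketch_value_and_pushforward!(f!, y, dy, x, dx)
    T = typeof(Tag(f!, eltype(x)))       # tag type, prevents perturbation confusion
    xdual = Dual{T}.(x, dx)              # seed: value = x, single partial = dx
    ydual = similar(y, eltype(xdual))    # in DI this buffer lives in `extras`
    f!(ydual, xdual)                     # one call yields value and JVP together
    map!(value, y, ydual)                # extract the primal value into y
    map!(d -> partials(d, 1), dy, ydual) # extract the tangent into dy
    return y, dy
end
```

With `f! = copyto!` as above, this should give `y ≈ x` and `dy ≈ dx`.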
My best guesses:

- Maybe it’s related to the lack of specialization on types, in this case on the tag type `T` (see the sketch below)
- Something something compiler inlining?
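To illustrate the first guess: Julia’s heuristics skip specialization on a bare `Type` argument that is only passed through to another function, while reflection tools like `@code_warntype` (and presumably JET) analyze the *specialized* method instance anyway, so they can report everything as fine. A toy demo, unrelated to DI’s actual code:

```julia
# Toy demo of the specialization heuristic (hypothetical functions):
# `passthrough` merely forwards T, so Julia compiles it unspecialized and the
# inner call becomes runtime dispatch with a boxed return; annotating
# `::Type{T}` forces specialization and should remove the allocation.
passthrough(T, x) = convert(T, x)
forced(::Type{T}, x) where {T} = convert(T, x)

using Chairmarks
@be passthrough(Float64, 1)  # expect a small allocation (boxed return)
@be forced(Float64, 1)       # expect 0 allocations
```

On recent Julia versions you can double-check which instances actually got compiled with `Base.specializations(only(methods(passthrough)))`.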
Ping @hill in case you have a clue