I'm learning how to use the ForwardDiff.jl package without allocating, and I want to implement a function for doing Jacobian-vector products in place.

The goal is to accomplish the same thing as `jvp_finite_difference!` below, but with less numerical error.
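To illustrate the kind of error I mean: a forward difference with step `1e-8` only gets about half the significant digits right, even for a toy scalar function (`g` and `t0` here are just made up for illustration):

```
g(t) = sin(t)
t0 = 1.0
fd = (g(t0 + 1e-8) - g(t0)) / 1e-8   # forward difference
exact = cos(t0)                      # true derivative
abs(fd - exact)                      # on the order of 1e-8, far from machine epsilon
```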

```
# Function to be differentiated: f(x) = sum(x) .* x.^2
function f!(out, x)
    broadcast!(^, out, x, 2)        # out = x.^2
    broadcast!(*, out, out, sum(x)) # out = sum(x) .* x.^2
end

# Forward-difference JVP: out ≈ (f(x + ε v) - f(x)) / ε
function jvp_finite_difference!(out, f!, y, z, x, v, epsilon=1e-8)
    broadcast!(*, z, v, epsilon)     # z = ε .* v
    broadcast!(+, z, z, x)           # z = x .+ ε .* v
    f!(y, x)                         # y = f(x)
    f!(out, z)                       # out = f(x + ε v)
    broadcast!(-, out, out, y)
    broadcast!(/, out, out, epsilon)
end
# This works
x = randn(10); v = copy(x); out = copy(x); y = copy(x); z = copy(x)
jvp_finite_difference!(out, f!, y, z, x, v)
using ForwardDiff
# Implementation using ForwardDiff
function jvp!(out, f!, y, z, x, v)
    # g!(y, t) computes f(x + t*v) in place; its derivative at t = 0 is the JVP
    g!(y, t) = begin
        broadcast!(*, z, v, t)   # z = t .* v
        broadcast!(+, z, z, x)   # z = x .+ t .* v
        f!(y, z)
    end
    ForwardDiff.derivative!(out, g!, y, 0)
end
# This fails with the error below
jvp!(out, f!, y, z, x, v)
```

```
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#g!#63"{typeof(f!),Array{Float64,1},Array{Float64,1},Array{Float64,1}},Int64},Float64,1}
```

The ForwardDiff.jl docs clearly say that any storage used inside the differentiated function must be generic enough to hold dual numbers, so I guess this is the problem.

Using really generic storage works, but performance drops:

```
# This works, but slow
x = ones(10); v = ones(10); out = ones(10); y = ones(Real, 10); z = copy(y)
jvp!(out, f!, y, z, x, v)
```
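One thing I have been experimenting with (this is my own guess from poking at ForwardDiff's internals, so the `Dual{Nothing,Float64,1}` tag handling may well be wrong, and `jvp_dual!` is just my hypothetical name): preallocate `Dual`-typed caches once, seed them with `x + ε v`, and read the directional derivative out of the partials:

```
using ForwardDiff

# Same test function as above: f(x) = sum(x) .* x.^2
function f!(out, x)
    broadcast!(^, out, x, 2)
    broadcast!(*, out, out, sum(x))
end

x = randn(10); v = randn(10); out = similar(x)
# Dual-typed caches, allocated once up front (tag Nothing is my guess)
zd = similar(x, ForwardDiff.Dual{Nothing,Float64,1})
yd = similar(zd)

function jvp_dual!(out, f!, yd, zd, x, v)
    zd .= ForwardDiff.Dual{Nothing}.(x, v)   # seed: zd = x + ε v
    f!(yd, zd)                               # yd = f(x + ε v) as duals
    out .= ForwardDiff.partials.(yd, 1)      # extract J(x) * v
    out
end

jvp_dual!(out, f!, yd, zd, x, v)
```

It seems to agree with the analytic JVP for this `f` (which is `sum(v) .* x.^2 .+ 2 .* sum(x) .* x .* v`), but I am not sure whether skipping ForwardDiff's usual tag machinery like this is safe in general.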

What am I missing here? Is there a storage type that is less generic (and faster) than `Vector{Real}` but still works with ForwardDiff.jl? Or is there something else I should do?

Very thankful for help and suggestions!