After reading this I got the impression that computations with dual numbers are typically expected to be twice as slow as non-duals. The reasoning was that both the normal and dual computations have to calculated at each operation, much like in complex arithmetic.

However I fail to observe this in the following minimal example. The example is motivated by the computation of parts of the gradient of a function `g(x::AbstractVector)`

. Assuming forward mode AD, I can either split the function’s input into two vectors (one operating with dual numbers and the other not) or use dual numbers for everything. In doing the latter I don’t observe the nearly as much as `(2*5 + 2*20)/(2* 5 + 20)`

= 1.667x slowdown that I would expect.

```
using ForwardDiff
using BenchmarkTools
using Base: Iterators
function f(x::AbstractVector{T}, y::AbstractVector) where {T}
result = zero(T)
for j in 1:1000 # 1000 "computations" to simulate a situation where f or g dominates allocation costs in override
for value in x
result += (0.9 * value)^j
end
for value in y
result += (0.9 * value)^j
end
end
return result
end
const x_values = ones(5)
const y_values = ones(20)
@btime ForwardDiff.gradient(x -> f(x, y_values), x_values)
const combined_values = vcat(x_values, y_values)
function g(combined::AbstractVector{T}) where {T}
result = zero(T)
for j in 1:1000 # 1000 "computations" to simulate a situation where f or g dominates allocation costs in override
for value in combined
result += (0.9 * value)^j
end
end
return result
end
function override(x, y::T) where {T}
result = T(x)
for i in 1:length(y)
result[i] = y[i]
end
return result
end
override(combined_values, x_values)
@btime ForwardDiff.gradient(x -> g(override(combined_values, x)), x_values)
```

Which gives:

```
587.081 μs (4 allocations: 848 bytes)
657.692 μs (5 allocations: 2.16 KiB)
```