I think you can use something like
using ForwardDiff, LinearAlgebra

# Vector-Jacobian product: returns J(x)' * v, where J is the Jacobian of f at x.
function vjp(f::Function, x::AbstractVector, v::AbstractVector)
    g(y) = dot(v, f(y))               # scalar-valued; its gradient is J(x)' * v
    return ForwardDiff.gradient(g, x)
end
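For example, with a throwaway f (the function and dimensions here are just placeholders):

f(y) = [sum(abs2, y), prod(y)]        # example function from R^3 to R^2
x = [1.0, 2.0, 3.0]
v = [0.5, -1.0]
vjp(f, x, v)                          # agrees with ForwardDiff.jacobian(f, x)' * v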
Reverse-mode autodiff would be much more efficient unless your vector x is short, though, since the cost of forward mode grows with the length of x, while a reverse-mode pullback gives you the whole product in a single backward pass.
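A minimal reverse-mode sketch, assuming Zygote (an assumption on my part; any reverse-mode package exposing a pullback, e.g. ReverseDiff, works the same way):

using Zygote  # assumed reverse-mode AD package, not part of the answer above

function vjp_reverse(f::Function, x::AbstractVector, v::AbstractVector)
    _, back = Zygote.pullback(f, x)   # one forward pass, recording the computation
    return back(v)[1]                 # one reverse pass computes J(x)' * v
end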
[Edit: Although it looks nice, I think this formulation amounts to computing the full Jacobian internally (possibly in chunks when x is long… but the total work is the same nevertheless).]
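For intuition, the forward-mode vjp above does essentially the same work as forming the Jacobian explicitly and multiplying:

J = ForwardDiff.jacobian(f, x)        # built column-by-column from forward-mode duals
J' * v                                # same result, and roughly the same cost, as vjp(f, x, v)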