I’m terribly confused by the number of packages that provide autodiff functionality and by their peculiarities.
I need to compute the gradient of a multivariable function (e.g. f(x, y), where
x and y are Numbers).
I found that AutoDiffSource and ReverseDiffSource are not supported.
I managed to do it with the ReverseDiff.jl package in the following way:
f(a, b) = (1-a).^2 + 100*(a-b.^2).^2;
g = (x,y) -> ReverseDiff.gradient(f, (x,y))
But it looks odd to me.
I have to pass explicit Arrays and write
dotted (elementwise) operations such as .^ in the function definition. Why?
I managed to compute it with ForwardDiff.jl using derivative, but not gradient.
g(a, b) = (1-a)^2 + 100*(a-b^2)^2
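dfdx = (x, y) -> ForwardDiff.derivative(a -> g(a, y), x) # partial derivative in a, with b = y held fixed
dfdy = (x, y) -> ForwardDiff.derivative(b -> g(x, b), y) # partial derivative in b, with a = x held fixed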
Are these two solutions equivalent?
What is the right way to solve this problem?
What derivatives do you want? Do you mean:
[df/da, df/db] at a = 1.0, b = 2.0:
julia> using ForwardDiff
julia> g(a, b) = (1-a)^2 + 100*(a-b^2)^2
g (generic function with 1 method)
julia> ForwardDiff.gradient(z -> g(z[1], z[2]), [1.0, 2.0])
or grad_f(x,y), which can evaluate the gradient at an arbitrary point:
julia> grad(x, y) = ForwardDiff.gradient(z -> g(z[1], z[2]), [x, y])
grad (generic function with 1 method)
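As a quick sanity check, here is a minimal sketch that compares this gradient against partial derivatives worked out by hand, and against two ForwardDiff.derivative calls that each hold the other argument fixed. The helper names (dg_da, dg_db, dg_da_ad, dg_db_ad) are made up for illustration, and it assumes ForwardDiff is installed.

```julia
using ForwardDiff

g(a, b) = (1 - a)^2 + 100 * (a - b^2)^2

# Partial derivatives derived by hand (hypothetical helper names)
dg_da(a, b) = -2 * (1 - a) + 200 * (a - b^2)
dg_db(a, b) = -400 * b * (a - b^2)

# Gradient via ForwardDiff: pack the two scalars into a vector,
# then unpack them again inside the closure
grad(x, y) = ForwardDiff.gradient(z -> g(z[1], z[2]), [x, y])

# The same partials via ForwardDiff.derivative, holding the other argument fixed
dg_da_ad(x, y) = ForwardDiff.derivative(a -> g(a, y), x)
dg_db_ad(x, y) = ForwardDiff.derivative(b -> g(x, b), y)

by_hand       = [dg_da(1.0, 2.0), dg_db(1.0, 2.0)]
by_gradient   = grad(1.0, 2.0)
by_derivative = [dg_da_ad(1.0, 2.0), dg_db_ad(1.0, 2.0)]
```

All three should agree up to floating-point rounding; at (1.0, 2.0) the hand-derived values are [-600.0, 2400.0]. So the derivative-based and gradient-based approaches give the same numbers, and gradient just computes both partials in one call.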
Points are typically defined in Julia using Vectors (or, for small dimensions, StaticVectors). So writing it something like
julia> g(x) = (1-x[1])^2 + 100*(x[1]-x[2]^2)^2 # could of course unpack x into a and b here
g (generic function with 2 methods)
julia> grad(x) = ForwardDiff.gradient(g, x)
grad (generic function with 2 methods)
might be a bit more natural.
julia> using StaticArrays # good for small dim vectors
julia> using BenchmarkTools
julia> @btime grad($([2,3]));
4.436 μs (4 allocations: 304 bytes)
julia> @btime grad($(SVector(2,3)));
2.089 ns (0 allocations: 0 bytes)
which is a 2000x performance difference (the gap is this large, of course, because g is so simple in this case).
OK, this looks like what I was looking for.
Now I’m only chasing a syntax that is convenient for me (I will abandon this habit).
We can’t pass a tuple to ForwardDiff like this:
∇g(x,y) = ForwardDiff.gradient(z -> g(z[1], z[2]), (x,y))
Hence we can only use syntax where points are represented as vectors, not tuples.
Yes, for tuples you would convert them to an Array (or, for better performance, a StaticArray):
julia> ∇g(x,y) = ForwardDiff.gradient(z -> g(z[1], z[2]), SVector(x,y))
∇g (generic function with 1 method)
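If you really want a tuple-in, tuple-out interface, one possible sketch is a thin wrapper that converts at the boundary. The name grad_tuple is hypothetical, and this assumes both ForwardDiff and StaticArrays are installed.

```julia
using ForwardDiff, StaticArrays

g(a, b) = (1 - a)^2 + 100 * (a - b^2)^2

# Hypothetical wrapper: take a 2-tuple, differentiate over an SVector,
# and convert the resulting gradient back to a tuple
grad_tuple(p::NTuple{2,Real}) =
    Tuple(ForwardDiff.gradient(z -> g(z[1], z[2]), SVector(p)))

result = grad_tuple((1.0, 2.0))
```

The conversion happens only at the edges, so the differentiation itself still runs on a small static vector.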
Oh, perfect. Thank you a lot.