I’m quite confused by the number of packages that provide autodiff functionality, and by their peculiarities.

I need to compute the gradient of a multivariable function (e.g. `f(x,y)`, where `x,y` are Numbers).

I found that AutoDiffSource and ReverseDiffSource are not supported.

I managed to do it with the ReverseDiff.jl package in the following way:

```
using ReverseDiff

f(a, b) = (1-a).^2 + 100*(a-b.^2).^2
g = (x, y) -> ReverseDiff.gradient(f, (x, y))
g([1], [1])
```

But this looks odd to me: I have to pass explicit Arrays and write broadcasted (`.^`) operations in the function definition. Why?
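For what it’s worth, `ReverseDiff.gradient` differentiates functions whose arguments are arrays, which is why the explicit Arrays and broadcasting show up. A scalar front-end can be sketched by packing the scalars into a vector first (the `rosenbrock`/`grad_f` names here are just for illustration):

```julia
using ReverseDiff  # assumes the package is installed

# ReverseDiff differentiates functions of array arguments, so one option is to
# pack the scalars into a vector and differentiate a single-argument function.
rosenbrock(v) = (1 - v[1])^2 + 100 * (v[1] - v[2]^2)^2

grad_f(x, y) = ReverseDiff.gradient(rosenbrock, [x, y])

grad_f(1.0, 2.0)  # ≈ [-600.0, 2400.0]
```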

I managed to compute it with ForwardDiff.jl using `derivative`, but not `gradient`:

```
using ForwardDiff

g(a, b) = (1-a)^2 + 100*(a-b^2)^2
# Partial derivatives: hold one argument fixed and differentiate in the other.
dfdx = (x, y) -> ForwardDiff.derivative(a -> g(a, y), x)
dfdy = (x, y) -> ForwardDiff.derivative(b -> g(x, b), y)
@show dfdx(1, 2)
@show dfdy(1, 2)
```

Are these two solutions equivalent?

What is the right way to solve this problem?

Thx!

What derivatives do you want? Do you mean:

```
julia> g(a, b) = (1-a)^2 + 100*(a-b^2)^2
g (generic function with 1 method)
julia> ForwardDiff.gradient(z -> g(z[1], z[2]), [1.0, 2.0])
2-element Array{Float64,1}:
-600.0
2400.0
```

giving you `[df/da, df/db]` at `a = 1.0, b = 2.0`.
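As a sanity check, the partial derivatives can also be computed by hand (a sketch; `dgda`/`dgdb` are names introduced here):

```julia
# g(a, b) = (1 - a)^2 + 100*(a - b^2)^2, so by the chain rule:
#   dg/da = -2*(1 - a) + 200*(a - b^2)
#   dg/db = -400*b*(a - b^2)
dgda(a, b) = -2*(1 - a) + 200*(a - b^2)
dgdb(a, b) = -400*b*(a - b^2)

dgda(1.0, 2.0), dgdb(1.0, 2.0)  # (-600.0, 2400.0), matching ForwardDiff
```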


I want a `grad_f(x,y)` that can evaluate the gradient at an arbitrary point `(x,y)`.

```
julia> grad(x, y) = ForwardDiff.gradient(z -> g(z[1], z[2]), [x, y])
grad (generic function with 1 method)
julia> grad(2,3)
2-element Array{Int64,1}:
-1398
8400
```

Points are typically defined in Julia using Vectors (or, for small dimensions, StaticVectors), so writing it as something like

```
julia> g(x) = (1-x[1])^2 + 100*(x[1]-x[2]^2)^2 # could of course unpack x into a and b here
g (generic function with 2 methods)
julia> grad(x) = ForwardDiff.gradient(g, x)
grad (generic function with 2 methods)
julia> grad([2,3])
2-element Array{Int64,1}:
-1398
8400
julia> using StaticArrays # good for small dim vectors
julia> grad(SVector(2,3))
2-element SArray{Tuple{2},Int64,1,2}:
-1398
8400
```

might be a bit more natural.
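The unpacking mentioned in the comment above could look like this (a sketch; the destructured version is equivalent to indexing into `x`):

```julia
# Unpack the point into named scalars; `a, b = x` works for Vectors,
# SVectors, and tuples alike.
function g(x)
    a, b = x
    return (1 - a)^2 + 100 * (a - b^2)^2
end

g([2, 3])  # 4901, same as the indexing version
```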

Also, note:

```
julia> using BenchmarkTools
julia> @btime grad($([2,3]));
4.436 μs (4 allocations: 304 bytes)
julia> @btime grad($(SVector(2,3)));
2.089 ns (0 allocations: 0 bytes)
```

which is roughly a 2000x performance difference (which, of course, is only this dramatic because g is so simple here).


OK, this looks like what I’m looking for.

Now I’m just chasing syntax that’s convenient for me (a habit I’ll abandon).

We can’t pass a tuple to ForwardDiff like this:

```
∇g(x,y) = ForwardDiff.gradient(z -> g(z[1], z[2]), (x,y))
```

hence we can only use syntax where points are represented as vectors, not tuples.

Yes, for tuples you would convert them to an Array (or, for better performance, an SVector):

```
julia> ∇g(x,y) = ForwardDiff.gradient(z -> g(z[1], z[2]), SVector(x,y))
∇g (generic function with 1 method)
julia> ∇g(1,2)
2-element SArray{Tuple{2},Int64,1,2}:
-600
2400
```


Oh, perfect. Thank you a lot.