I would like to calculate the gradient of a simple function using ForwardDiff.gradient, but I have been running into a problem where I get different answers from code that should seemingly be doing the same thing.
Since what I am trying to do is rather basic (calculating the gradient of a function) and I wouldn't expect the ForwardDiff package to get this wrong, I expect the problem to be on my end.
I have reduced my code to the MWE below. I realize the code doesn't make much sense as it stands, but it illustrates the problem I am having:
Start minimum working example:
using ForwardDiff

# f1: z, gammax and gammaz are all just x itself
function f1(x)
    z = x
    gammax = x
    gammaz = z
    return sum(x .* (log.(x) .+ log.(gammax) .- log.(z) .- log.(gammaz)))
end

# f2: z (and hence gammaz) is the fixed array [0.1 0.9] instead of x
function f2(x)
    z = [0.1 0.9]
    gammax = x
    gammaz = z
    return sum(x .* (log.(x) .+ log.(gammax) .- log.(z) .- log.(gammaz)))
end
g1 = x -> ForwardDiff.gradient(f1,x)
g2 = x -> ForwardDiff.gradient(f2,x)
fval1 = f1([0.1 0.9])
fval2 = f2([0.1 0.9])
gval1 = g1([0.1 0.9])
gval2 = g2([0.1 0.9])
Expected answers:
In function f1 the bracketed terms involving logarithms are expected to cancel to zero, regardless of what x is. The same is true if I pass [0.1 0.9] as the input argument to f2. Therefore, the function values fval1 and fval2 are expected to be zero, and so are the gradients.
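To spell out the cancellation I have in mind (my own notation; \gamma_x and \gamma_z stand for gammax and gammaz):

f(x) = \sum_i x_i \left( \log x_i + \log \gamma_{x,i} - \log z_i - \log \gamma_{z,i} \right)

In f1 we have z = \gamma_x = \gamma_z = x, so the bracket is \log x_i + \log x_i - \log x_i - \log x_i = 0 for any admissible x. In f2, z and \gamma_z are fixed at [0.1 0.9], so the bracket is again zero when the input is x = [0.1 0.9].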
Outputs:
fval1
0.0
fval2
0.0
gval1
1×2 Array{Float64,2}:
0.0 0.0
gval2
1×2 Array{Float64,2}:
2.0 2.0
→ fval1, fval2, and gval1 are all as expected, but gval2 seems wrong?
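To rule out that I am misreading the ForwardDiff output, one thing I could compare against is a crude central finite-difference gradient. The helper fd_gradient below is just a quick sketch and is not part of the MWE above:

# Crude central finite-difference gradient, only for cross-checking ForwardDiff
function fd_gradient(f, x; h=1e-6)
    g = zeros(size(x))
    for i in eachindex(x)
        # perturb one entry at a time and take a central difference
        xp = copy(x); xm = copy(x)
        xp[i] += h
        xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2h)
    end
    return g
end

fd_gradient(f1, [0.1 0.9])   # compare with gval1
fd_gradient(f2, [0.1 0.9])   # compare with gval2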
Can anyone help me understand this? It's completely puzzling to me.