Hello, I'm trying to use the ForwardDiff.jl package for automatic differentiation of a function that includes a matrix solve with the backslash (\) operator. ForwardDiff requires arrays to use the more general type Real (instead of Float64). However, the \ solve always outputs an Array{Float64}.
Example code:
# input v
v = [1. 2. 3.]
# true X (would be the target in an optimization)
Xtrue = [1.1, 1.9, 0.8]

function f(v)
    # in A*X = B, solve for X
    # A is a 3x3 square matrix whose values depend on the input v
    A = Matrix{Real}([1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]])
    B = Vector{Real}([5., 13., 23.])  # length-3 vector
    # solve for X using the backslash operator
    X = A \ B  # length-3 vector
    # difference between calculated X and true X is the cost
    cost = sum((X - Xtrue).^2)
    return cost
end
using ForwardDiff
ForwardDiff.gradient(f, v)
ForwardDiff.gradient gives this error:
MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 3})
I think it's because A \ B returns an Array{Float64} for X.
Is there a way to make this work?
You’re way overthinking it. Julia will promote the types when you put them in an array. The simplest code works:
using ForwardDiff

# input v
v = [1. 2. 3.]
# true X (would be the target in an optimization)
Xtrue = [1.1, 1.9, 0.8]

function f(v)
    # in A*X = B, solve for X
    # A is a 3x3 square matrix whose values depend on the input v
    A = [1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]]
    B = [5., 13., 23.]  # length-3 vector
    # solve for X using the backslash operator
    X = A \ B  # length-3 vector
    # difference between calculated X and true X is the cost
    cost = sum((X - Xtrue).^2)
    return cost
end

ForwardDiff.gradient(f, v)
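A quick way to see the promotion at work (a small sketch; the Dual is constructed by hand here purely for illustration):

using ForwardDiff

d = ForwardDiff.Dual(3.0, 1.0)      # a dual number: value 3.0 with one partial equal to 1.0
A = [1. 2. d; 6. 5. 4.; 8. 7. 2.]   # the Float64 literals are promoted to the Dual type
eltype(A)                           # a concrete ForwardDiff.Dual{...} element type, not Real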
No, this is not quite right. ForwardDiff requires you to support arbitrary Real types as input. This means that, like any type-generic code, you want to work with whatever number type is passed in the arguments and construct your arrays accordingly. It does not mean to use Matrix{Real}.
(You still want to end up working with arrays of concrete types, which is crucial for performance in Julia, but you have to support ForwardDiff passing in a weird “dual number” type as the input vector.)
For example, this works:
function f(v, xtrue)
    A = [1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]]  # promotes based on the type of v
    b = [5., 13., 23.]
    x = A \ b
    return sum(abs2, x - xtrue)  # better yet, use norm(x - xtrue) from LinearAlgebra
end
v = [1. 2. 3.]
xtrue = [1.1, 1.9, 0.8]
import ForwardDiff
ForwardDiff.gradient(v -> f(v, xtrue), v)
(Note also that I edited f so that it no longer uses the global variable Xtrue, which is good to avoid on general software-engineering grounds, and especially in Julia, where non-constant globals hurt performance.)
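For reference, the norm-based variant mentioned in the code comment above might look like the sketch below (f_norm is just an illustrative name; note that norm gives the Euclidean distance itself rather than its square, so it is a slightly different cost function):

using LinearAlgebra: norm

function f_norm(v, xtrue)
    A = [1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]]
    b = [5., 13., 23.]
    x = A \ b
    return norm(x - xtrue)  # Euclidean distance rather than its square
end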
Thanks for your help. Unfortunately, it turns out that I asked a bad question and that my example didn't get to the crux of my problem -_-
In reality, I need to build the matrices A and B row-by-row in a loop, and thus need to pre-allocate them, which I believe forces me to fix their element type.
function f(v, xtrue)
    A = Matrix{Float64}(undef, 3, 3)
    # A values assigned row-by-row
    for i in 1:size(A, 1)
        if i % 2 == 0  # i is even
            A[i, :] = [v[i] 1. 2.]
        else
            A[i, :] = [5. 6. v[i]]
        end
    end
    b = [5., 13., 23.]
    x = A \ b
    return sum(abs2, x - xtrue)
end
v = [1. 2. 3.]
xtrue = [1.1, 1.9, 0.8]
import ForwardDiff
ForwardDiff.gradient(v -> f(v, xtrue), v)
I just wrote a silly example of loop assignment here; in reality the conditional depends on another variable, not on whether i is odd or even.
I then get an error from ForwardDiff: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#13#14", Float64}, Float64, 3}).
It's as if pre-allocating A as a Matrix{Float64} prevents the promotion to Dual numbers? Is there an easy way around this?
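For what it's worth, the failure seems to come from the assignment into the pre-allocated Float64 matrix rather than from the solve itself; a minimal sketch (assuming ForwardDiff is loaded) reproduces the same MethodError:

using ForwardDiff

A = Matrix{Float64}(undef, 3, 3)
d = ForwardDiff.Dual(1.0, 1.0)  # the kind of number ForwardDiff passes in for v[i]
A[1, 1] = d                     # MethodError: no method matching Float64(::ForwardDiff.Dual{...})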
You need to allocate A in such a way that it can hold elements of eltype(v), for example something like this:
function f(v::AbstractVector{T}, xtrue) where T
    A = Matrix{T}(undef, 3, 3)
    # ...
end
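Putting that together with the loop example above, a complete sketch might look like the following (same logic as before, with only the element type of A changed; note that v is written as a comma-separated Vector here, since the signature restricts it to AbstractVector):

import ForwardDiff

function f(v::AbstractVector{T}, xtrue) where T
    A = Matrix{T}(undef, 3, 3)  # can now hold Dual numbers when v is a vector of Duals
    # A values assigned row-by-row
    for i in 1:size(A, 1)
        if i % 2 == 0  # i is even
            A[i, :] = [v[i] 1. 2.]
        else
            A[i, :] = [5. 6. v[i]]
        end
    end
    b = [5., 13., 23.]
    x = A \ b
    return sum(abs2, x - xtrue)
end

v = [1., 2., 3.]
xtrue = [1.1, 1.9, 0.8]
ForwardDiff.gradient(v -> f(v, xtrue), v)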