Problem using ForwardDiff with backslash "\" matrix solver

Hello, I’m trying to use the ForwardDiff.jl package for automatic differentiation of my function, which includes using the backslash \ operator for a matrix solve. ForwardDiff requires Arrays to be of the more general type Real (instead of Float64). However, the \ solver always outputs an Array{Float64}.

Example code:

# input v
v = [1. 2. 3.]

# true X (would be the target in an optimization)
Xtrue = [1.1, 1.9, 0.8]

function f(v)
    # in A*X = B, solve for X

    # A is #3x3 square matrix, with values depending on input v
    A = Matrix{Real}([1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]]) 
    
    B = Vector{Real}([5., 13., 23.])  #3x1 matrix

    # solve for X using backslash operator
    X = A \ B  #3x1 vector

    # difference between calculated X and true X is the cost
    cost = sum((X - Xtrue).^2) 

    return cost
end

using ForwardDiff
ForwardDiff.gradient(f, v)

ForwardDiff.gradient gives this error:
MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(f), Float64}, Float64, 3})

I think it’s because A \ B outputs an Array{Float64} to X.

Is there a way to make this work?

You’re way overthinking it. Julia will promote the types when you put them in an array. The simplest code works:

using ForwardDiff
# input v
v = [1. 2. 3.]

# true X (would be the target in an optimization)
Xtrue = [1.1, 1.9, 0.8]

function f(v)
    # in A*X = B, solve for X

    # A is #3x3 square matrix, with values depending on input v
    A = [1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]]

    B = [5., 13., 23.]  #3x1 matrix

    # solve for X using backslash operator
    X = A \ B  #3x1 vector

    # difference between calculated X and true X is the cost
    cost = sum((X - Xtrue).^2)

    return cost
end

using ForwardDiff
ForwardDiff.gradient(f, v)
3 Likes

No, this is not quite right. ForwardDiff requires you to support arbitrary Real types as input. This means that, like any type-generic code, you want to work with whatever number type is passed in the arguments and construct your arrays accordingly. It does not mean to use Matrix{Real}.

(You still want to end up working with arrays of concrete types, which is crucial for performance in Julia, but you have to support ForwardDiff passing in a weird “dual number” type as the input vector.)

For example, this works:

function f(v, xtrue)
    A = [1. 2. v[1]; 6. v[2] 4.; 8. 7. v[3]] # promotes based on type of v
    b = [5., 13., 23.]
    x = A \ b
    return sum(abs2, x - xtrue) # better yet, use norm(x - xtrue) from LinearAlgebra
end

v = [1. 2. 3.]
xtrue = [1.1, 1.9, 0.8]

import ForwardDiff
ForwardDiff.gradient(v -> f(v, xtrue), v)

(Note also that I edited f to no longer use a global variable Xtrue, which is good to avoid on general software-engineering principles but also particularly in Julia.)

5 Likes

Thanks for your help. Unfortunately, it turns out that I asked a bad question any that my example didn’t get to the crux of my problem -_-

In reality, I need to build matrices A and B line-by-line in a loop, and thus need to pre-allocate them, which I believe forces me to define their type.

function f(v, xtrue)
    A = Matrix{Float64}(undef, 3, 3)

    # A values assigned row-by-row
    for i in 1:size(A, 1)
        if i % 2 == 0  # is even
            A[i, :] = [v[i] 1. 2.]
        else
            A[i, :] = [5. 6. v[i]]
        end
    end

    b = [5., 13., 23.]
    x = A \ b
    return sum(abs2, x - xtrue)
end

v = [1. 2. 3.]
xtrue = [1.1, 1.9, 0.8]

import ForwardDiff
ForwardDiff.gradient(v -> f(v, xtrue), v)

I just wrote a silly example of loop assignment here – in reality the conditional is on another variable, not whether i is odd and even.

I then get error from FowardDiff, MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#13#14", Float64}, Float64, 3}) .

It’s as if pre-allocating A forces x into Float64 and prevents its promotion? Is there an easy way around this?

You need to allocate A in such a way that it can hold elements of typeof(v), for example something like this

function f(v::AbstractVector{T}, xtrue) where T
    A = Matrix{T}(undef, 3, 3)
    # ...
end
2 Likes