Mutating array workaround for reverse AD?

Hi,

I have yet another unsolved problem for reverse mode AD (or I think, this is the same problem as the one I posted previously: LoadError: ArgumentError: Converting an instance of ReverseDiff.TrackedReal{Float64, Float64, Nothing} to Float64 is not defined. Please use `ReverseDiff.value` instead). Here is the (hopefully) MWE:

using LinearAlgebra
using Zygote, ReverseDiff

function f_test_1(A, x)
    """
    u = Ax + some constant row vector
    """
    u = A*x[2:end] .+ x[1]
    return u
end

function f_test_2(A, x)
    """
    alloc inside
    """
    u = Vector{Float64}(undef, length(x)-1)
    u .= A*x[2:end] .+ x[1]
    return u
end

function f_test_3!(u, A, x)
    """
    non-alloc ver
    """
    u .= A*x[2:end] .+ x[1]
end

# derivative:
J_f_1(A, x) = Zygote.jacobian(θ -> f_test_1(A, θ), x)
J_f_2(A, x) = Zygote.jacobian(θ -> f_test_2(A, θ), x)
J_f_3(u, A, x) = Zygote.jacobian(θ -> f_test_3!(u, A, θ), x)

J_f_1 works, however _2 and _3 don’t, for example:

A = Matrix{Float64}(LinearAlgebra.I, 5, 5)
x = ones(6)
u = Vector{Float64}(undef, 5)

J_f_1(A, x) # this works
J_f_2(A, x) # throws "Mutating arrays is not supported -- called copyto!(::Vector{Float64}, _...)"
J_f_3(u,A,x) # same as _2

Also, replacing Zygote with ReverseDiff produces similar error it seems, although the error trace shows different thing (ReverseDiff gives “ArgumentError: Converting an instance of ReverseDiff.TrackedReal{Float64, Float64, Nothing} to Float64 is not defined. Please use ReverseDiff.value instead.”, I tried using ReverseDiff.value, however it results in incorrect Jacobian (all 0.)).
So, are there any workarounds for Zygote or ReverseDiff to work for mutating arrays (other than using f_test_1)? For example if I have more complicated functions which require assignments of values to the output array(s), like B[i,:,:] = … .
Thanks.

You could write your own rrule (= vector–Jacobian product) for the routines that need mutation.

I pretty much assume that for any sufficiently complicated calculation, with any AD software in any language, that I will have to write at least some manual derivatives (see also section 5 of these notes).

2 Likes

Just as a quick note, docstrings in Julia go above the function not below it, right now you are creating a string inside the function :slight_smile:

Enzyme.jl does support in-place AD, but it’s BLAS support is nascent. I took your code and opened an issue so that we can make sure to add support for the necessary BLAS funcion https://github.com/EnzymeAD/Enzyme.jl/issues/312

You could write your own rrule (= vector–Jacobian product) for the routines that need mutation.

Thanks, although currently I still haven’t fully figure out yet on how to write the rrule such that it could make reverse AD for mutating array works (my understanding of rrule is still lacking).

Just as a quick note, docstrings in Julia go above the function not below it, right now you are creating a string inside the function :slight_smile:

I see, thanks for pointing it out!

Enzyme.jl does support in-place AD, but it’s BLAS support is nascent

Ok, I will check it out.

For ReverseDiff, just do identity.(x) to the TrackedArray to get an Array{TrackedReal} and then make the buffers match that type. That will then reverse scalar-wise, in which case you want to use the compiled mode to accelerate it. Note that this looks like a PDE, and DifferentialEquations.jl automates this.