Best way to perform AD of an array

I have a function that defines some 2D array. A toy example might be something like:

    function create_array(coordinates, a)
        g = zeros(typeof(a), 4, 4)
        g[1,1] = coordinates[1]^2 + coordinates[2] + 3
        g[2,2] = coordinates[2]^2 + coordinates[3] + 4
        return g
    end

I want to be able to compute the derivative of the array with respect to the coordinate variables, but not the argument a, which is a constant. There are 4 coordinate variables.

What is the best way to do this using an AD library?

The approach I have used initially is to define a pure function for each of the elements of the array, and to differentiate that e.g.

    function array_term_g11(coordinates, a)
        return coordinates[1]^2 + coordinates[2] + 3
    end

    Zygote.jacobian(x -> array_term_g11(x, a), [t r θ ϕ])

This is obviously verbose and clunky, though, and it seems like Enzyme is now preferred over Zygote. However, it is not obvious to me from reading the docs how to implement this derivative of an array w.r.t. a vector input in Enzyme?

Any guidance much appreciated. Thanks

Perhaps something like this (if you wanted chunk size = 1, and forward mode as the internal routine)?

    using Enzyme

    function create_array(coordinates, a)
        g = zeros(typeof(a), 4, 4)
        g[1,1] = coordinates[1]^2 + coordinates[2] + 3
        g[2,2] = coordinates[2]^2 + coordinates[3] + 4
        return g
    end

    t = 1.0
    r = 2.0
    θ = 3.0
    ϕ = 4.0

    a = 5.0
    x = [t r θ ϕ]
    shadow = onehot(x)
    jacobian = ntuple(length(shadow)) do i
        autodiff(Forward, create_array, DuplicatedNoNeed, Duplicated(x, shadow[i]), Const(a))[1]
    end

    # ([2.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], [1.0 0.0 0.0 0.0; 0.0 4.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], [0.0 0.0 0.0 0.0; 0.0 1.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], [0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0])

As a follow-up, how would I extend this to get second derivatives, i.e. the Hessian? Is this actually possible with Enzyme? I only see references to Jacobians in the docs…

Yes. In essence you would call autodiff on autodiff.
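To make the nesting pattern concrete, here is a minimal sketch for a scalar toy function (not from the thread; it reuses the forward-mode autodiff call signature from the Jacobian snippet above, and some Enzyme versions may require autodiff_deferred on the inner call):

```julia
using Enzyme

# toy scalar function, chosen so the Hessian is easy to check by hand:
# ∂²h/∂x₁² = 2, ∂²h/∂x₁∂x₂ = 1, ∂²h/∂x₂² = 0
h(x) = x[1]^2 + x[1] * x[2]

# forward-mode directional derivative of h at x in direction v
dh(x, v) = autodiff(Forward, h, DuplicatedNoNeed, Duplicated(x, v))[1]

x  = [1.0, 2.0]
e1 = [1.0, 0.0]
e2 = [0.0, 1.0]

# "autodiff on autodiff": differentiate the directional derivative again,
# giving the (1,2) entry of the Hessian (analytically 1)
H12 = autodiff(Forward, y -> dh(y, e2), DuplicatedNoNeed, Duplicated(x, e1))[1]
```

Each outer direction e1 paired with each inner direction e2 picks out one entry of the Hessian; looping over both directions recovers the whole matrix.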

Hmm, so using the same structure and looking at the examples I would expect to get the first row of the Hessian via something like

    i = 1
    x = [t r θ ϕ]
    shadow = onehot(x)
    Enzyme.autodiff(
        Forward,
        x -> Enzyme.autodiff_deferred(Reverse, create_array, x, Const(a)),
        DuplicatedNoNeed,
        Duplicated(x, shadow[i]), Const(a)
    )

This doesn’t work - clearly I am misunderstanding something! For this case, how does one call autodiff on autodiff?

On my phone, but one thing is that your inner closure only takes one argument (x), while the outer autodiff call supplies two (x and Const(a)).

Also, any arguments that you don’t wrap in an activity annotation (like x on the inner autodiff) are assumed constant — which means the inner AD doesn’t do what you expect.

I’ll write a snippet tomorrow after flying out from EnzymeCon
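In the meantime, one way to sidestep the Duplicated-return restriction of reverse mode is forward-over-forward: differentiate the directional derivative from the Jacobian snippet a second time. A sketch only (not a confirmed Enzyme recipe; it repeats the setup from above for self-containment, and depending on the Enzyme version the inner call may need autodiff_deferred):

```julia
using Enzyme

function create_array(coordinates, a)
    g = zeros(typeof(a), 4, 4)
    g[1,1] = coordinates[1]^2 + coordinates[2] + 3
    g[2,2] = coordinates[2]^2 + coordinates[3] + 4
    return g
end

a = 5.0
x = [1.0 2.0 3.0 4.0]   # [t r θ ϕ]
shadow = onehot(x)

# forward-mode directional derivative of create_array at y in direction v
deriv(y, v, a) = autodiff(Forward, create_array, DuplicatedNoNeed,
                          Duplicated(y, v), Const(a))[1]

# hessian[i][j][k,l] ≈ ∂²g[k,l] / ∂x[i]∂x[j]
hessian = ntuple(length(shadow)) do i
    ntuple(length(shadow)) do j
        autodiff(Forward, y -> deriv(y, shadow[j], a), DuplicatedNoNeed,
                 Duplicated(x, shadow[i]))[1]
    end
end

# e.g. hessian[1][1][1,1] should be ∂²g[1,1]/∂x₁² = 2
```

The captured shadow[j] and a inside the closure are treated as constants by the outer forward pass, which is exactly what is wanted here.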

So if I consider just the autodiff_deferred() function, I would want to write something like

    Enzyme.autodiff_deferred(create_array, Active, Duplicated(x, shadow[i]), Const(a))

The activity here needs to be Active, since Duplicated returns are not handled. But then, looking at the src at L392, we call

    one(eltype(rt))

where rt is the return type, which in this case is Matrix{Float64}. However, one() doesn’t seem to have a method that applies here?