Array addition in Enzyme

There are a couple of issues here, but they’re related in how I discovered them.

In attempting to test Enzyme on some basic vector operations, I came across a problem with derivatives of the addition function. In particular, this example with LabelledArrays fails because there is no method of one for SLArray:

using Random, LabelledArrays, Enzyme

SLV = @SLVector Float64 (:a, :b, :c);

u = SLV(rand(3));
v = SLV(rand(3));
du = zero(u);
dv = zero(v);

@show autodiff(Reverse, +, Active, Active(u), Active(v))


ERROR: LoadError: MethodError: no method matching one(::Type{SLArray{Tuple{3}, Float64, 1, 3, (:a, :b, :c)}})

Closest candidates are:
  one(::Type{Union{}}, Any...)
   @ Base number.jl:349
  ...

Stacktrace:
 [1] autodiff(::ReverseMode{false, FFIABI}, ::Const{typeof(+)}, ::Type{Active}, ::Active{SLArray{Tuple{3}, Float64, 1, 3, (:a, :b, :c)}}, ::Vararg{Active{SLArray{Tuple{3}, Float64, 1, 3, (:a, :b, :c)}}})
   @ Enzyme ~/.julia/packages/Enzyme/jOGYG/src/Enzyme.jl:213
 [2] autodiff(::ReverseMode{false, FFIABI}, ::typeof(+), ::Type, ::Active{SLArray{Tuple{3}, Float64, 1, 3, (:a, :b, :c)}}, ::Vararg{Active{SLArray{Tuple{3}, Float64, 1, 3, (:a, :b, :c)}}})
   @ Enzyme ~/.julia/packages/Enzyme/jOGYG/src/Enzyme.jl:224
 [3] macro expansion
   @ show.jl:1181 [inlined]
 [4] top-level scope
   @ ~/software/julia/enzyme/slarray-one-mwe.jl:10
 [5] include(fname::String)
   @ Base.MainInclude ./client.jl:489
 [6] top-level scope
   @ REPL[8]:1

I understand why there might not be a method for one, since the multiplicative identity can be a different type than the input (a matrix vs. a vector, for example). That made me think of Vector, which also has no one method. It turns out Enzyme doesn’t handle vector addition the way I would expect:

using Random, Enzyme

x = rand(3); y = rand(3); dx = zero(x); dy = zero(y)

autodiff(Reverse, +, Active, Duplicated(x, dx), Duplicated(y, dy))
ERROR: Enzyme Mutability Error: Cannot add one in place to immutable value [0.0, 0.0, 0.0]
Stacktrace:
 [1] add_one_in_place
   @ ~/.julia/packages/Enzyme/jOGYG/src/compiler.jl:4979 [inlined]
 [2] augmented_julia___2562wrap
   @ ./arraymath.jl:0
 [3] macro expansion
   @ Enzyme.Compiler ~/.julia/packages/Enzyme/jOGYG/src/compiler.jl:5306 [inlined]
 [4] enzyme_call(::Val{false}, ::Ptr{Nothing}, ::Type{Enzyme.Compiler.AugmentedForwardThunk}, ::Type{Val{1}}, ::Val{false}, ::Type{Tuple{Duplicated{Vector{Float64}}, Duplicated{Vector{Float64}}}}, ::Type{Duplicated{Vector{Float64}}}, ::Const{typeof(+)}, ::Type{@NamedTuple{1, 2, 3, 4, 5, 6::Bool, 7::Bool, 8::Bool, 9, 10::Bool, 11::UInt64, 12::UInt64}}, ::Duplicated{Vector{Float64}}, ::Duplicated{Vector{Float64}})
   @ Enzyme.Compiler ~/.julia/packages/Enzyme/jOGYG/src/compiler.jl:4984
 [5] (::Enzyme.Compiler.AugmentedForwardThunk{Ptr{Nothing}, Const{typeof(+)}, Duplicated{Vector{Float64}}, Tuple{Duplicated{Vector{Float64}}, Duplicated{Vector{Float64}}}, Val{1}, Val{false}(), @NamedTuple{1, 2, 3, 4, 5, 6::Bool, 7::Bool, 8::Bool, 9, 10::Bool, 11::UInt64, 12::UInt64}})(::Const{typeof(+)}, ::Duplicated{Vector{Float64}}, ::Vararg{Duplicated{Vector{Float64}}})
   @ Enzyme.Compiler ~/.julia/packages/Enzyme/jOGYG/src/compiler.jl:4937
 [6] autodiff(::ReverseMode{false, FFIABI}, ::Const{typeof(+)}, ::Type{Active}, ::Duplicated{Vector{Float64}}, ::Vararg{Duplicated{Vector{Float64}}})
   @ Enzyme ~/.julia/packages/Enzyme/jOGYG/src/Enzyme.jl:198
 [7] autodiff(::ReverseMode{false, FFIABI}, ::typeof(+), ::Type, ::Duplicated{Vector{Float64}}, ::Vararg{Duplicated{Vector{Float64}}})
   @ Enzyme ~/.julia/packages/Enzyme/jOGYG/src/Enzyme.jl:224
 [8] top-level scope
   @ REPL[6]:1

I would expect the answer to be that dx = ones(3). Is this unsupported by Enzyme currently (but with plans to support it in the future)? Is this custom rules territory?

Reverse mode in Enzyme doesn’t currently support differentiating with respect to non-scalar returns. Your code returns an array (z = x + y).

This is because Enzyme needs to know what to seed the derivative of the return value with (by default one, hence the error you saw earlier). With an array return it is thus ambiguous whether you want to differentiate with respect to z[1], z[2], etc.
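To illustrate the unambiguous case: with a scalar return, the reverse sweep is simply seeded with one. A minimal sketch (assuming Enzyme.jl is installed):

```julia
using Enzyme

# Scalar in, scalar out: the seed is one(output), no ambiguity.
# The returned tuple holds the derivatives with respect to a and b
# (both 1.0 for plain addition).
g = autodiff(Reverse, (a, b) -> a + b, Active, Active(2.0), Active(3.0))
```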

There are a lot of ways you can resolve this, depending on what you want to do. If you want dx to come back as all 1’s, you probably want to differentiate with respect to the sum of all the outputs. That is to say, do autodiff(Reverse, (x, y) -> sum(x + y), Duplicated(x, dx), Duplicated(y, dy)). Technically this will compute dz[1]/dx + dz[2]/dx + ... .
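Put together, the sum-based approach might look like this (a sketch assuming Enzyme.jl is installed):

```julia
using Enzyme

x = rand(3); y = rand(3)
dx = zero(x); dy = zero(y)

# Summing the outputs gives a scalar return that reverse mode can seed;
# the cotangents accumulate into the shadow arrays dx and dy.
autodiff(Reverse, (x, y) -> sum(x + y), Active, Duplicated(x, dx), Duplicated(y, dy))

# dx and dy should now both be all ones, since d(sum(x + y))/dx[i] = 1.
```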

If you care about the derivative wrt a specific seeding, you can do this by providing the return array.

function f(out, x, y)
  for i in 1:length(x)
    out[i] = x[i] + y[i]
  end
  return nothing
end

x = rand(3); y = rand(3); dx = zero(x); dy = zero(y)
out = Vector{Float64}(undef, 3)
dout = ones(3)

autodiff(Reverse, f, Const, Duplicated(out, dout), Duplicated(x, dx), Duplicated(y, dy))

Technically this is more general and will actually compute dout[1] * dz[1]/dx + dout[2] * dz[2]/dx + ... . Thus if you wanted the derivative wrt the first output, set dout = [1, 0, 0]. If you want the derivative wrt the first output plus two times the derivative wrt the second output you could set dout = [1, 2, 0], etc.
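For example, to pull out just the first row of the Jacobian with the f defined above, seed only the first output (a sketch, assuming Enzyme.jl):

```julia
using Enzyme

x = rand(3); y = rand(3)
dx = zero(x); dy = zero(y)
out = Vector{Float64}(undef, 3)

# Seed only the first output; since d(out[1])/dx = [1, 0, 0] for
# elementwise addition, dx should come back as [1, 0, 0].
dout = [1.0, 0.0, 0.0]
autodiff(Reverse, f, Const, Duplicated(out, dout), Duplicated(x, dx), Duplicated(y, dy))
```

Note that dout is consumed (zeroed) during the reverse pass, so reset it before differentiating again.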

Alternatively, duplicated (i.e. non-scalar) returns are permissible in split reverse mode, like the following.

fwd, rev = autodiff_thunk(ReverseSplitWithPrimal, Const{typeof(+)}, Duplicated, Duplicated{typeof(x)}, Duplicated{typeof(y)})

# run the forward pass
tape, out, dout = fwd(Const(+), Duplicated(x, dx), Duplicated(y, dy))

# change dout to whatever you want
dout .= 1

# run the reverse pass
rev(Const(+), Duplicated(x, dx), Duplicated(y, dy), tape)

Docs PRs welcome! Pull requests · EnzymeAD/Enzyme.jl · GitHub

Thanks, @wsmoses. I’ve been messing with other things and forgot about the limitation on non-scalar returns (even though it’s at the top of this page, :man_facepalming: ). The ultimate application will result in a scalar output, but I was trying to check intermediate steps that Enzyme would take as a sanity check on some of my types.

I’m still at a loss as to what to do with SLArray without defining a custom rule for +. I guess if I’m never interested in the derivatives of + explicitly, then maybe it doesn’t matter.

I think I’ve already made at least one commit to the docs, when I first read through them. Maybe I’ll make some more in the future, but I should probably better understand what the heck is going on, first. :slight_smile:

You shouldn’t need a custom rule for that. What happens if you differentiate sum(a + b), or whatever is needed to return a float from an SLArray?
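For instance, something along these lines might work (an untested sketch; since an SLVector is immutable, it would be passed as Active rather than Duplicated, with the derivatives returned rather than accumulated into shadows):

```julia
using LabelledArrays, Enzyme

SLV = @SLVector Float64 (:a, :b, :c)
u = SLV(rand(3)); v = SLV(rand(3))

# sum(a + b) returns a Float64, so reverse mode has a scalar to seed.
# g[1] would hold the gradients with respect to u and v.
g = autodiff(Reverse, (a, b) -> sum(a + b), Active, Active(u), Active(v))
```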

That’s not an intermediate step. The point of reverse-mode AD is that it typically never computes the expensive intermediate step of a full Jacobian matrix. It takes advantage of the fact that you have a scalar output, and propagates the chain rule starting at that output and working backwards to the inputs.

The intermediate steps of reverse-mode AD are vector–Jacobian products. So, if you want to check this in a function f(x) that has a vector output, you could try differentiating v^T f(x) for a random vector v.
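For the x + y example above, that check might look like this (a sketch assuming Enzyme.jl; dot is from the standard library):

```julia
using Enzyme, LinearAlgebra

x = rand(3); y = rand(3)
dx = zero(x); dy = zero(y)
v = rand(3)

# Differentiating the scalar v' * f(x, y) computes the vector–Jacobian
# product v' * J. For f(x, y) = x + y the Jacobian in x is the identity,
# so dx should come back equal to v.
autodiff(Reverse, (x, y) -> dot(v, x + y), Active, Duplicated(x, dx), Duplicated(y, dy))
```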


This does exactly what I want. Thanks!

I guess what I wrote and what I meant were different. I want to test on some simple function to check whether a custom type will work without a custom rule.

This is a very helpful clarification on how AD works and good idea for a test. Thanks, @stevengj!