ArgumentError ReverseDiff: cannot reinterpret

I am trying to use reverse-mode differentiation. A much simplified version of the function call is

using StaticArrays
using ReverseDiff

const Po{T} = SArray{Tuple{2},T,1,2}       # point in R2
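function ff(x)
    T = eltype(x)
    y = reinterpret(Po{T}, x)   # view x as a vector of 2D points
    return sum(sum(y))          # sum over all coordinates of all points
end

x = rand(10)
ReverseDiff.gradient(ff, x)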

I get an ArgumentError caused by the reinterpret line. Is there some way to work around this? [Note that the function may look like a very inefficient way to compute the sum of the input array, but in the actual program the reinterpret step is necessary.]

The exact error message is as follows:

ERROR: ArgumentError: cannot reinterpret `ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}` `SArray{Tuple{2},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}},1,2}`, type `SArray{Tuple{2},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}},1,2}` is not a bits type

The quick answer is to use ForwardDiff.gradient(ff,x). ForwardDiff tracks derivatives through a dense array of Dual numbers, which are bits types and so play well with reinterpret.
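As a minimal sketch of that suggestion: the body of ff below is an assumption (the question only says it effectively sums the input), and the rest follows the code in the question.

```julia
using StaticArrays
using ForwardDiff

const Po{T} = SArray{Tuple{2},T,1,2}   # point in R2, as in the question

# Assumed body: the question says ff effectively sums the input array.
function ff(x)
    T = eltype(x)               # Float64, or a ForwardDiff.Dual while differentiating
    y = reinterpret(Po{T}, x)   # allowed here because Dual numbers are bits types
    return sum(sum(y))
end

x = rand(10)
g = ForwardDiff.gradient(ff, x)   # gradient of a plain sum
```

Since ff is just a sum in this sketch, every entry of g should equal 1.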

You could write a function for ReverseDiff which would reinterpret on the forward pass and undo that on the reverse pass. But I think it needs to produce a vector of tracked SVectors, not a tracked vector of SVectors, which I don’t think is so easy.

I think Zygote plays better with arrays of arrays than Tracker/ReverseDiff do, but it doesn’t work out of the box here. It is, however, fairly easy to do some parts via ForwardDiff while the overall program is using Zygote.
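One way to mix the two, sketched here under assumptions, is Zygote.forwarddiff, which asks Zygote to differentiate a single inner call in forward mode. The body of ff and the outer function loss are made up for illustration, and I have not checked this against every Zygote version.

```julia
using StaticArrays
using Zygote

const Po{T} = SArray{Tuple{2},T,1,2}   # point in R2, as in the question

# Assumed body: the question says ff effectively sums the input array.
function ff(x)
    T = eltype(x)
    y = reinterpret(Po{T}, x)
    return sum(sum(y))
end

# Hypothetical outer function: Zygote handles `loss` in reverse mode,
# while the reinterpret-heavy part is differentiated by ForwardDiff.
loss(x) = 2 * Zygote.forwarddiff(ff, x)

x = rand(10)
g, = Zygote.gradient(loss, x)
```

Note that forward mode costs time roughly linear in length(x), so this only pays off if the forward-mode part has few inputs or is applied slice by slice.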

Thanks. I tried Zygote as well. It completely crashes on my computer and Julia gets restarted.

Zygote errors FWIW are:

julia> Zygote.gradient(ff,x)  # hard case
ERROR: Need an adjoint for constructor Base.ReinterpretArray{SVector{2, Float64}, 1, Float64, Vector{Float64}, false}. Gradient is of type FillArrays.Fill{FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}, 1, Tuple{Base.OneTo{Int64}}}

julia> function ff2(x)
           T = eltype(x)
           y = reinterpret(Po{T}, x)
           sum(identity, sum(v -> v.^2, y))
       end
ff2 (generic function with 1 method)

julia> ff2(rand(10))

julia> Zygote.gradient(ff2,x)  # easy case
ERROR: Need an adjoint for constructor Base.ReinterpretArray{SVector{2, Float64}, 1, Float64, Vector{Float64}, false}. Gradient is of type Vector{Vector{Float64}}

These might be solvable by adding rules for reinterpretation, but I haven’t tried. (There are some issues about SVectors IIRC.)

What I did try is hooking up ForwardDiff over SVector slices of an array to Zygote/Tracker on the un-sliced array, in SliceMap.jl (mcabbott/SliceMap.jl on GitHub). That doesn’t try to handle ReverseDiff, although it would work exactly the same way as Tracker. Depending on what your actual problem looks like, some similar approach may work. I think SVectors and ForwardDiff are quite natural friends, since both are tuples with special interpretations. If slices are small enough to want SVectors, then they are small enough that the overhead of attaching reverse-mode tracing machinery to each of them is probably going to be large.

ForwardDiff runs fine on my code, but it seems to scale badly with dimension. For that reason I wanted to assess the performance of ReverseDiff. The reinterpret can of course be avoided by allocating a new variable y. I think it would be useful to have some guidelines or general advice on writing Julia code with automatic differentiation in mind. I have only found relatively simple examples so far.
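For reference, a minimal sketch of the allocating version, assuming the simplified function just sums all coordinates. The name ff_alloc is made up for illustration, and the copy costs an allocation that reinterpret would avoid.

```julia
using StaticArrays
using ReverseDiff

const Po{T} = SArray{Tuple{2},T,1,2}   # point in R2

# Build the points explicitly instead of reinterpreting the input.
# During differentiation T is a ReverseDiff.TrackedReal, which is not a bits type,
# but constructing SVectors from individual elements does not need that.
function ff_alloc(x)
    T = eltype(x)
    y = [Po{T}(x[2i-1], x[2i]) for i in 1:length(x) ÷ 2]
    return sum(sum(y))
end

x = rand(10)
g = ReverseDiff.gradient(ff_alloc, x)
```

This indexes the tracked array element by element, so each scalar operation is recorded on the tape separately. That may be slow for large inputs.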