Incorrect function evaluation in Zygote when swapping variables

I have a relatively concise routine cheby that evaluates Φ = exp(-i H dt) Ψ for a Hermitian matrix H and a complex vector Ψ by expanding the exponential into Cheybchev polynomials. A minimal working example adapted from the cheby routine in QuantumPropagators.jl is pasted below.

I’m finding that I get different results if I run cheby directly or inside Zygote.withgradient (where the Zygote result is wrong). I’ve managed to pinpoint it to one specific line that does a cyclic permutation of three vectors, v0, v1, v2 = v1, v2, v0. It works if I rewrite it as aux = 1 * v0; v0 = 1 * v1; v1 = 1 * v2; v2 = 1 * aux (and only if I have the factor 1 in there).

Does anybody have any insight into what is going on here? Should I simply report this as a bug in Zygote?

The full MWE is this:

using Zygote
using LinearAlgebra
using Distributions
using Test

function random_hermitian_matrix(N, specrad=1.0)
    σ = 1 / √N
    d = Normal(0.0, σ)
    X = (rand(d, (N, N)) + rand(d, (N, N)) * 1im) / √2
    H = specrad * (X + X') / (2 * √2)
end


function random_state_vector(N)
    Ψ = rand(N) .* exp.((2π * im) .* rand(N))
    Ψ ./= norm(Ψ)
    return Ψ
end


function cheby(Ψ, H, dt)
    a = [0.9915910021578431, 0.18282201929219635, 0.008403088661031203, 0.000257307553262815]
    Δ = 6.0; E_min = -3.0
    β = (Δ / 2) + E_min
    c = -2im / Δ

    v0 = Ψ
    ϕ = a[1] * v0
    v1 = c * (H * v0 - β * v0)
    ϕ = ϕ + a[2] * v1

    c *= 2
    for i = 3:length(a)
        v2 = c * (H * v1 - β * v1) + v0
        ϕ = ϕ + a[i] * v2
        v0, v1, v2 = v1, v2, v0                                # doesn't work
        #aux = v0; v0 = v1; v1 = v2; v2 = aux                   # doesn't work
        #aux = 1 * v0; v0 = 1 * v1; v1 = 1 * v2; v2 = 1 * aux   # works
    end

    return exp(-1im * β * dt) * ϕ
end

N = 2
dt = 0.1

Ψ0 = random_state_vector(N)
Ψ1 = random_state_vector(N)
H0 = random_hermitian_matrix(N)
H1 = random_hermitian_matrix(N)

ϵ = 1.0
res1 = abs2(Ψ1 ⋅ cheby(Ψ0, H0 + ϵ * H1, dt))
res2, grad = Zygote.withgradient(ϵ -> abs2(Ψ1 ⋅ cheby(Ψ0, H0 + ϵ * H1, dt)), ϵ)

@test abs(res1 - res2) < 1e-12

Ok, I submitted an issue: https://github.com/FluxML/Zygote.jl/issues/1198