Errors when trying to compute hessian of flux neural net and jacobian of jacobian with zygote

Hi, I’m trying to translate some of my code in Jax over to Julia and running into some errors when trying to use Zygote with Flux neural networks.

I have two problems:

  1. I get an error when trying to compute the Hessian of a neural network function, and this just may be my misunderstanding of the syntax, although I’ve tried it multiple ways.
  2. I get an error when trying to compute the jacobian of another function that involves the jacobian of yet another function.

Here is my code:

using Flux, DiffEqFlux, Zygote
using OrdinaryDiffEq, Random
using LinearAlgebra
import Tracker

struct CustomModel
chain::Chain
end

function(m::CustomModel)(x,dx)
input = vcat(x,dx)
return m.chain(input)
end

f = Chain(
Dense(4, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 1)
)

g = Chain(
Dense(4, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 1)
)

h = Chain(
Dense(4, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 8, swish),
Dense(8, 1)
)

Flux.@functor CustomModel

T = CustomModel(f)
U = CustomModel(g)
R = CustomModel(h)

L(x,dx) = T(x,dx) .- U(x,dx)
pₖ(x,dx) = jacobian(L,x,dx)[2]

function H(x, dx)
hamilt = dot(pₖ(x,dx), dx) .- L(x,dx)
end

M(x,dx) = Zygote.hessian(T(x,dx), [x, dx])

function dp(x,dx)
dp = jacobian(H,x,dx)[1] .- jacobian(R,x,dx)[2]
end

println(M(rand(2),rand(2))) # error 1
println(dp(rand(2),rand(2))) # error 2

And the errors that I get:

  1. ArgumentError: Cannot create a dual over scalar type Any. If the type behaves as a scalar, define ForwardDiff.can_dual(::Type{Any}) = true.
  2. Mutating arrays is not supported – called copyto!(::SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}, _…)

I’d appreciate any advice. I’m brand new to Julia so I apologize if this is a silly question.

For reference, in jax I am using the jax.jacfwd and jax.hessian functions to accomplish the same thing.

Thank you!

Zygote.hessian works by calling ForwardDiff, and is thus limited to arrays of numbers. You can work around this using destructure:

julia> Zygote.hessian(x -> sqrt(sum(abs2, x)), [1.0, 2.0])  # fwd over rev
2×2 Matrix{Float64}:
  0.357771  -0.178885
 -0.178885   0.0894427

julia> ForwardDiff.hessian(x -> sqrt(sum(abs2, x)), [1.0, 2.0])  # fwd over fwd
2×2 Matrix{Float64}:
  0.357771  -0.178885
 -0.178885   0.0894427

julia> Zygote.hessian(x -> sqrt(sum(sum, x)), [[1.0], [2.0, 3.0]])
ERROR: ArgumentError: Cannot create a dual over scalar type Any. If the type behaves as a scalar, define ForwardDiff.can_dual(::Type{Any}) = true.

julia> ForwardDiff.hessian(x -> sqrt(sum(sum, x)), [[1.0], [2.0, 3.0]])  # different error but same problem
ERROR: MethodError: no method matching one(::Type{Vector{Float64}})

julia> v, re = Flux.destructure([[1.0], [2.0, 3.0]])
([1.0, 2.0, 3.0], Restructure(Array, ..., 3))

julia> Zygote.hessian(v -> sqrt(sum(sum, re(v))), v)
3×3 Matrix{Float64}:
 -0.0170103  -0.0170103  -0.0170103
 -0.0170103  -0.0170103  -0.0170103
 -0.0170103  -0.0170103  -0.0170103

julia> ForwardDiff.hessian(v -> sqrt(sum(sum, re(v))), v)
3×3 Matrix{Float64}:
 -0.0170103  -0.0170103  -0.0170103
 -0.0170103  -0.0170103  -0.0170103
 -0.0170103  -0.0170103  -0.0170103

This could be better documented, we’d be very happy to have an issue, or better, and attempt at editing the docstring here.

Zygote.jacobian wasn’t written with being Zygote-differentiable in mind, unfortunately. I guess the issue is this one.