Is there an efficient way to compute the Hessian of an NN?

Thank you both for your answers, and thank you for this nice paper; I understand AD much better now. I wanted to try both the RoF and double-forward methods, but I run into a stack overflow when I apply ForwardDiff to a neural network. Only the jacobian method accepts an AbstractArray for both input and output; the others assume one of the two is Real. However:

using Flux, ForwardDiff

f = Chain(Dense(10, 5, elu), Dense(5, 3))
x = rand(10)
ForwardDiff.jacobian(f, x)

This throws a stack overflow. I assume this method does n forward passes, one per input dimension. But the underlying ForwardDiff.gradient cannot take a function with multi-dimensional output. Isn't it weird for a forward-accumulation mode not to be able to do that? According to the paper by Baydin et al. it is quite straightforward to do. The same goes for ForwardDiff.derivative, which only handles functions with Real inputs.
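
In case it helps pin this down, here is a sketch of the same two-layer network written against plain arrays, bypassing the Flux layers entirely (W1, b1, W2, b2 are made-up stand-ins for the Chain's parameters, and my_elu mirrors Flux's elu):

using ForwardDiff

# Plain-array version of the Chain above; parameters are made-up stand-ins
my_elu(x, α = 1.0) = x > 0 ? x : α * (exp(x) - 1)

W1, b1 = randn(5, 10), randn(5)
W2, b2 = randn(3, 5), randn(3)
g(x) = W2 * my_elu.(W1 * x .+ b1) .+ b2

x = rand(10)
J = ForwardDiff.jacobian(g, x)   # 3×10 Jacobian

If jacobian handles this array-in/array-out closure, the overflow would seem to come from the interaction with the Flux layers rather than from ForwardDiff itself.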

Am I missing something?
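
For reference, the end goal is a double-forward Hessian of a scalar loss, along the lines of this sketch (loss is just a made-up sum-of-outputs wrapper so the outer function returns a Real, as gradient requires):

using ForwardDiff

# Made-up scalar wrapper around the network output
loss(x) = sum(g(x))   # g from the plain-array sketch above (or f, once it works)

∇loss = ForwardDiff.gradient(loss, x)                            # forward-mode gradient
H = ForwardDiff.jacobian(y -> ForwardDiff.gradient(loss, y), x)  # forward-over-forward
H ≈ ForwardDiff.hessian(loss, x)                                 # built-in nested version

If that is indeed the intended way to get a Hessian out of ForwardDiff, then the remaining question is just how to get the Chain itself to accept the dual numbers.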