Gradient of a Hessian-dependent loss function

mtgd · November 20, 2019, 6:27pm

For my application it would be useful to have a loss function that depends on the Hessian of a model.
Computing the Hessian with respect to the input variable (using Tracker) works fine, but taking the gradient of that Hessian with respect to the parameters fails as soon as the model involves matrix multiplication (as Dense layers do).

Consider the following example:

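using Flux  # assumption: a Tracker-based Flux release (v0.9 or earlier), where `param` and `Tracker` are in scope

W = param(rand(1, 2))                        # tracked 1×2 weight matrix
fn = z -> sum(W * z)^2                       # scalar model; parsed as (sum(W * z))^2
hess = z -> Tracker.hessian(fn, z)           # Hessian with respect to the input z (this works)
# Gradient of the summed Hessian entries with respect to W (this call fails):
Tracker.gradient(() -> sum(hess([1.0, 2.0])), Flux.Params([W]))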

This throws the error:
MethodError: Cannot `convert` an object of type Base.ReshapedArray{Float64,2,Transpose{Float64,Array{Float64,2}},Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}} to an object of type Transpose{Float64,Array{Float64,2}}
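
For reference, the value the Tracker.gradient call above should return can be worked out by hand. This is only a sketch: the symbols w_1, w_2 (the entries of W) and z_1, z_2 (the entries of z) are notation introduced here, and it relies on Julia parsing sum(W * z)^2 as (sum(W * z))^2.

```latex
\begin{aligned}
f(z) &= (w_1 z_1 + w_2 z_2)^2 \\
\nabla_z f(z) &= 2\,(w_1 z_1 + w_2 z_2)\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \\
H(z) &= \nabla_z^2 f(z) = 2\begin{pmatrix} w_1^2 & w_1 w_2 \\ w_1 w_2 & w_2^2 \end{pmatrix} \\
\sum_{i,j} H_{ij} &= 2\,(w_1 + w_2)^2 \\
\frac{\partial}{\partial W}\sum_{i,j} H_{ij} &= 4\,(w_1 + w_2)\begin{pmatrix} 1 & 1 \end{pmatrix}
\end{aligned}
```

So the expected gradient is 4(w_1 + w_2) in both entries of W, independent of the point z at which the Hessian is evaluated. That gives a simple value to check any workaround against.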

Any thoughts or advice appreciated.

I suspect this is a bug, but it’s hard to track down where. Even knowing which package is responsible would help, so that I can submit an issue there.

jkbest2 · November 21, 2019, 6:35pm

Unfortunately, I don’t have anything to add here, except that I’m also interested in getting gradients of a Hessian-dependent function. In this case it might be worth filing an issue with Tracker, or trying Zygote, which is intended to replace Tracker.
