This is similar to this question.
I am training a neural network NN(), where the loss function involves the gradient of NN() with respect to the input. For example:
using Flux, LinearAlgebra

m = Flux.Chain(Dense(5, 5, relu), Dense(5, 5, relu), Dense(5, 1))
g(z) = only(m(z))                  # scalar output of the network

function loss(x, y)
    w = Flux.gradient(g, x)[1]     # gradient of g with respect to the input x
    yhat = dot(w, x) / g(x)
    return (yhat - y)^2
end
This does not work within the Flux framework. Currently, I train the network either 1) “manually”, using ForwardDiff.gradient where required, or 2) when the network is small, using Optim.
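For reference, the ForwardDiff variant is essentially the same loss with the inner gradient call swapped out. A rough sketch, not my exact code (the name loss_fd is just for illustration):

using ForwardDiff

function loss_fd(x, y)
    w = ForwardDiff.gradient(g, x)   # forward-mode gradient of g with respect to the input x
    yhat = dot(w, x) / g(x)
    return (yhat - y)^2
end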
Is there a way we can handle this within Flux? Many thanks!
I use such loss functions, but they are finicky. You need to be certain that all the gradients are themselves differentiable, which in practice means re-writing some gradient rules, as they are not friendly to second-order differentiation. In theory, though, Zygote can do second-order gradients. Be aware that compilation time might be long.
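As a minimal sketch of what the nested call looks like in principle, assuming the implicit-parameters API of older Flux versions (whether it actually runs depends on every adjoint along the way being itself differentiable):

using Flux, Zygote, LinearAlgebra

m = Chain(Dense(5, 5, relu), Dense(5, 5, relu), Dense(5, 1))
g(z) = only(m(z))

function loss(x, y)
    w = Zygote.gradient(g, x)[1]   # inner gradient with respect to the input
    yhat = dot(w, x) / g(x)
    return (yhat - y)^2
end

ps = Flux.params(m)
x, y = rand(Float32, 5), 0.5f0
gs = Flux.gradient(() -> loss(x, y), ps)   # outer gradient: Zygote differentiates through the inner gradient call
Flux.Optimise.update!(Descent(0.01), ps, gs)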
@balaji1975 How do you do the training with ForwardDiff.gradient?
What code comes after your loss? Specifically, how do you make the gradient call and then update the parameters?
@lazarusA
It has been a long time and my memory is sketchy. I ended up using just Flux with a custom loss function.
Essentially, I was trying to find a neural network N that minimized the error: