Hello, I am trying to implement the A3C reinforcement learning algorithm in Flux.jl using the custom loss function described in the algorithm. I am having trouble with this custom loss function: the parameters of my model do not update when I compute gradients through it. A demonstrative MWE is:
```julia
using Flux

# set up common model params
state_dim = 5
action_dim = 4
model = Chain(Dense(state_dim, 128, relu), Dense(128, action_dim + 1))
θ = params(model)
opt = ADAM()

# this loss function does not work
π_sa = 0.3
A_sa = 10
actor_loss_function(π_sa, A_sa) = log(π_sa) * A_sa
dθ = gradient(() -> actor_loss_function(π_sa, A_sa), θ)
display(θ)
Flux.update!(opt, θ, dθ)
display(θ)  # unchanged

# this loss function works
s_t = rand(5)
a_t = rand(5)
loss(x, y) = Flux.Losses.mse(model(x), y)
dθ_mse = gradient(() -> loss(s_t, a_t), θ)
display(θ)
Flux.update!(opt, θ, dθ_mse)
display(θ)  # changed
```
Does anyone have an idea of why my custom loss function is not working? Both loss functions return scalar quantities in this example. `actor_loss_function()` generally returns negative values (`π_sa = 0.3` is a probability, usually less than 1, so the `log()` makes it negative), while `Flux.Losses.mse` generally returns a positive value, if that makes a difference. Any thoughts/feedback are greatly appreciated.
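One suspicion I have: `actor_loss_function` never references `model`, so `gradient` may have nothing to differentiate with respect to `θ`. Here is a sketch of what I think the loss should look like instead, with `π_sa` computed from the model's output inside the gradient call. The details are my assumptions, not from the algorithm: I treat the first `action_dim` outputs as action logits, apply `softmax` to get probabilities, and negate the objective so minimizing it ascends on `log(π_sa) * A_sa`:

```julia
using Flux

state_dim = 5
action_dim = 4
model = Chain(Dense(state_dim, 128, relu), Dense(128, action_dim + 1))
θ = params(model)
opt = ADAM()

s_t = rand(state_dim)
a_t = 2       # index of the action taken (assumed for illustration)
A_sa = 10.0   # advantage estimate (placeholder value)

# the loss calls model(s) itself, so it actually depends on θ
function actor_loss(s, a, A)
    out = model(s)
    π_all = Flux.softmax(out[1:action_dim])  # assumed: first action_dim outputs are logits
    return -log(π_all[a]) * A                # negated: minimizing this maximizes log π * A
end

dθ = gradient(() -> actor_loss(s_t, a_t, A_sa), θ)
Flux.update!(opt, θ, dθ)  # parameters should now change
```

Is this the right way to hook the policy probability up to the model, or am I misunderstanding why the original version produces no update?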