Creating multiple Dense layers without activations is not going to give your function any more flexibility: the composition of affine maps is still just a single affine transform, but it adds a bunch of extra parameters, which slows down learning in most cases. I would either keep a single layer, Dense(100 => 1, tanh), or put activations in the intermediate layers.
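To see why the stacked layers add nothing, note that two Dense layers without activations collapse algebraically into one affine map. A minimal sketch (layer sizes are arbitrary, just for illustration):

```julia
using Flux

# Two Dense layers with no activation (identity) compose into one affine map.
m = Chain(Dense(4 => 8), Dense(8 => 1))

# The same function as a single affine transform:
# W2*(W1*x + b1) + b2 == (W2*W1)*x + (W2*b1 + b2)
W = m[2].weight * m[1].weight
b = m[2].weight * m[1].bias .+ m[2].bias

x = randn(Float32, 4)
m(x) ≈ W * x .+ b  # holds up to floating-point error: no extra expressiveness
```

So the intermediate layers only add parameters, not representational power, unless a nonlinearity sits between them.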
Removing the relu from the critic seems reasonable; otherwise you might randomly get zero gradients depending on the initialization and the data. If you make larger updates with more data, it could be more acceptable to keep it, since it is then more plausible that some of the data will still produce a positive pre-activation and thus a non-zero gradient.
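The zero-gradient failure mode is easy to reproduce in isolation. A hypothetical critic (sizes made up for the example) with a relu on the output layer gives exactly zero gradient whenever the final pre-activation happens to be negative:

```julia
using Flux

# Hypothetical critic with relu on the output layer. If the last layer's
# pre-activation is negative for the given input, the output is 0.0 and
# the gradient of every parameter is exactly zero, so nothing is learned.
critic = Chain(Dense(4 => 8, relu), Dense(8 => 1, relu))

x = randn(Float32, 4)
gs = Flux.gradient(() -> sum(critic(x)), Flux.params(critic))
# When critic(x) == [0.0], all entries in gs.grads are zero for this input.
```

Whether that happens depends on the random initialization and the input, which is why the behavior can look so arbitrary.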
The actor seems to have a similar problem. When I ran it and checked the output for s1, it was -1.0, indicating the pre-tanh value is strongly negative, so the tanh is saturated and its gradient is essentially zero. Checking the gradient of the action with respect to the parameters confirms this: they really are zero, or at least very close to it. Evaluating the gradient at a random input instead gives non-zero values.
```julia
julia> actor(s1)
1-element Vector{Float64}:
 -1.0

julia> gs = Flux.gradient(() -> sum(actor(s1)), Flux.params(actor))
Grads(...)

julia> gs.grads
IdDict{Any, Any} with 9 entries:
  Float32[0.0] => Float32[0.0]
  Float32[-0.0531993 -0.0839223 … … => Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0,…
  Float32[-0.0853877 0.130838 … 0.… => Float32[0.0 -0.0 … -0.0 0.0]
  Float32[-0.0815599 -0.130409 … -… => Float32[0.0 0.0 … -0.0 0.0; 0.0 0.0 … -0.0 0.0; … ; 0.0 0.0 … -0.…
  Float32[0.0633241 0.111354 … -0.… => Float32[-0.0 0.0 … -0.0 -0.0; -0.0 0.0 … -0.0 -0.0; … ; -0.0 0.0 …
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0,…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0,…
  :(Main.s1) => [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,…

julia> gs = Flux.gradient(() -> sum(actor(randn(size(s1)))), Flux.params(actor))
Grads(...)

julia> gs.grads
IdDict{Any, Any} with 8 entries:
  Float32[0.0] => Float32[0.688863]
  Float32[-0.0531993 -0.0839223 … … => Float32[-0.0834978 0.0162966 … -0.100557 0.141133; -0.0920515 0.0…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.010946, 0.105696, -0.0541519, -0.0183322, 0.000768424, …
  Float32[-0.0853877 0.130838 … 0.… => Float32[-0.0757515 0.195297 … -0.215118 -0.0997762]
  Float32[-0.0815599 -0.130409 … -… => Float32[0.00461565 -0.00612626 … -0.00670385 -0.00224172; 0.04456…
  Float32[0.0633241 0.111354 … -0.… => Float32[-0.0199989 -0.0170654 … -0.0421886 -0.0202436; 0.0306439 …
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[-0.0588204, 0.0901291, -0.0234954, 0.0110632, -0.0666493,…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[-0.0973696, -0.107344, 0.0529834, 0.0645531, -0.00943013,…
```
Recreating the actor, I got different results (since the random initialization was different):
```julia
julia> actor = gpu(actor_model(state_dim1))
Chain(
  Dense(14 => 100),                     # 1_500 parameters
  Dense(100 => 200),                    # 20_200 parameters
  Dense(200 => 150),                    # 30_150 parameters
  Dense(150 => 1, tanh),                # 151 parameters
)                   # Total: 8 arrays, 52_001 parameters, 203.629 KiB.

julia> actor(s1)
1-element Vector{Float64}:
 0.999993465657055

julia> gs = Flux.gradient(() -> sum(actor(randn(size(s1)))), Flux.params(actor))
Grads(...)

julia> gs.grads
IdDict{Any, Any} with 8 entries:
  Float32[0.182963 0.0816542 … 0.1… => Float32[-0.0372346 -0.0957257 … 0.0396487 0.0540225]
  Float32[-0.0363748 -0.103407 … -… => Float32[0.0134421 0.0336876 … 0.0912891 -0.0231217; 0.0151213 0.0…
  Float32[-0.0789767 0.225775 … -0… => Float32[0.0548423 0.0282836 … 0.049636 -0.108771; 0.0129213 0.006…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[-0.0767889, -0.0180921, -0.122244, 0.00583031, -0.0116575…
  Float32[0.0] => Float32[0.999765]
  Float32[0.0659612 0.0139627 … -0… => Float32[-0.152804 0.0411203 … -0.109576 0.0850265; -0.0681948 0.0…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.18292, 0.081635, -0.0876073, 0.00315569, 0.192231, 0.16…
  Float32[0.0, 0.0, 0.0, 0.0, 0.0,… => Float32[0.0947302, 0.106564, -0.0905477, 0.100901, -0.0278349, -0…
```
So now I do get non-zero gradients through the actor. This also worked in the full update together with the critic, producing updates to the agent network.
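For reference, combining the two suggestions above, a minimal sketch of the actor with activations in the intermediate layers (relu is my assumption; the sizes are taken from the printout above, and your actual actor_model may differ):

```julia
using Flux

# Sketch: same layer sizes as the printed Chain, but with relu between
# layers so the extra depth actually adds nonlinearity rather than
# collapsing into a single affine map.
actor_model(state_dim) = Chain(
    Dense(state_dim => 100, relu),
    Dense(100 => 200, relu),
    Dense(200 => 150, relu),
    Dense(150 => 1, tanh),
)

actor = actor_model(14)
actor(randn(Float32, 14))  # 1-element output in (-1, 1)
```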