model and grads[1] are trees with the same nesting structure and the same field names, except that model uses custom structs like Dense, while grads uses anonymous ones, NamedTuples.
Here is a smaller example, showing how you can explore the two:
julia> model = Chain(Dense(2=>1), SkipConnection(Dense(1=>1),+))
Chain(
  Dense(2 => 1),                        # 3 parameters
  SkipConnection(
    Dense(1 => 1),                      # 2 parameters
    +,
  ),
)                   # Total: 4 arrays, 5 parameters, 292 bytes.
julia> grads = gradient(m -> sum(abs2, m([1,-1])), model)
((layers = ((weight = Float32[-2.6218274 2.6218274], bias = Float32[-2.6218274], σ = nothing), (layers = (weight = Float32[0.8607899;;], bias = Float32[-1.6526356], σ = nothing), connection = nothing)),),)
julia> model.layers[1]
Dense(2 => 1) # 3 parameters
julia> model.layers[1].weight # pressing tab will show you field names as you type
1×2 Matrix{Float32}:
-0.675822 -0.154963
julia> model.layers[1].bias # initialised to zero
1-element Vector{Float32}:
0.0
julia> grads[1].layers[1] # corresponding to Dense
(weight = Float32[-2.6218274 2.6218274], bias = Float32[-2.6218274], σ = nothing)
julia> grads[1].layers[1].weight
1×2 Matrix{Float32}:
-2.62183 2.62183
julia> grads[1].layers[1].bias
1-element Vector{Float32}:
-2.6218274
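The correspondence between the two structures can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming a Flux version in which Dense has the fields weight, bias and σ, as in the output above; the variable names layer and g are only illustrative.

```julia
using Flux

model = Chain(Dense(2 => 1), SkipConnection(Dense(1 => 1), +))
grads = gradient(m -> sum(abs2, m([1, -1])), model)

layer = model.layers[1]      # a Dense struct
g = grads[1].layers[1]       # the NamedTuple holding its gradient

# The struct's field names and the NamedTuple's keys should agree, in order.
same_names = fieldnames(typeof(layer)) == keys(g)

# Each gradient array should have the same shape as the parameter it belongs to.
same_shapes = size(layer.weight) == size(g.weight) && size(layer.bias) == size(g.bias)

println((same_names, same_shapes))
```

Fields that carry no gradient, such as the activation function σ, appear in the NamedTuple as nothing, so a check like this is best restricted to the array fields.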
One catch is that model[2] works, and is the same as model.layers[2], but the equivalent indexing does not work on the gradient: grads[1][2] is an error. (Indexing a Chain indexes the tuple of layers inside it, but indexing a NamedTuple does not work this way.)
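The indexing catch can be reproduced with the same small model. This is a sketch rather than a recommended pattern: the try/catch is only there so that the failing call does not stop the script, and the names g2 and err are illustrative.

```julia
using Flux

model = Chain(Dense(2 => 1), SkipConnection(Dense(1 => 1), +))
grads = gradient(m -> sum(abs2, m([1, -1])), model)

# Chain forwards integer indexing to its `layers` tuple.
same_layer = model[2] === model.layers[2]

# On the gradient, reach the second layer through the `layers` field by name.
g2 = grads[1].layers[2]

# Integer indexing on a NamedTuple selects a field by position. grads[1] has
# only the single field `layers`, so asking for position 2 fails.
err = try
    grads[1][2]
    nothing
catch e
    e
end

println(same_layer)
println(typeof(err))
```

Writing grads[1].layers[2] for the gradient, and model.layers[2] for the model, keeps the two access paths identical.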
(Strictly speaking, these structures are not always trees, since the same object can appear more than once, but usually they are.)
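As an illustration of a model that is not strictly a tree, the sketch below places one layer object at two positions in a Chain. The name shared is hypothetical, and the example only shows that both positions refer to the same arrays; it says nothing about how gradients for such a model are reported.

```julia
using Flux

shared = Dense(2 => 2)           # one layer object
model = Chain(shared, shared)    # used at two positions in the same Chain

# Both positions hold the very same weight array, not two equal copies,
# so the structure has a shared node instead of two separate branches.
is_shared = model[1].weight === model[2].weight

println(is_shared)
```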