DiffEqFlux.sciml_train resulting parameters

I am following an example here. The neural network is defined as Chain(Dense(2,50,tanh),Dense(50,2)). The resulting “.minimizer” array has 252 entries. How is each of these entries mapped to the W1, b1, W2, b2 arrays? It looks to me like the first 100 values are the entries of the 50x2 W1 matrix, read column by column (top to bottom within each column, then left to right across columns). The next 50 elements would be the entries of b1, ordered top to bottom. The next 100 elements would be the entries of the 2x50 matrix W2, again read column by column. Finally, the last 2 values of the 252-length “.minimizer” array would be the 2 elements of b2, ordered top to bottom.

Can someone please confirm how all of the entries of the resulting optimization (".minimizer" array resulting from DiffEqFlux.sciml_train) map to the W_i, b_i matrices for each of the layers specified in the Chain(…) object?


The parameters are stored in the vector in the same order that you give them. It is probably easiest to visualize if you restructure the values.
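To make the ordering concrete, here is a minimal sketch in plain Julia that unpacks a 252-element vector by hand. It assumes the layout described in the question (weight then bias for each layer, each array flattened column by column); the vector θ is a hypothetical stand-in for res2.minimizer, filled with 1 to 252 so each position is easy to trace.

```julia
# Hypothetical stand-in for res2.minimizer: the values 1.0, 2.0, ..., 252.0
θ = collect(1.0:252.0)

# Assumed layout for Chain(Dense(2,50,tanh), Dense(50,2)): W1, b1, W2, b2.
# Julia arrays are column-major, so reshape fills each column top to bottom.
W1 = reshape(θ[1:100], 50, 2)    # θ[1:50] is column 1, θ[51:100] is column 2
b1 = θ[101:150]
W2 = reshape(θ[151:250], 2, 50)  # θ[151:152] is column 1, θ[153:154] is column 2, ...
b2 = θ[251:252]

println(size(W1), " ", size(b1), " ", size(W2), " ", size(b2))
```

This only illustrates the assumed ordering; comparing it against the arrays of the restructured network is the way to confirm it for a given Flux version.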

Thank you, Chris. I’m getting “ERROR: LoadError: UndefVarError: restructure not defined” when I try Flux.restructure(dudt2) or DiffEqFlux.restructure(dudt2). If I try Flux.params(dudt2), that gives me the arrays W1, b1, W2, b2 (252 elements total), and the structure seems to make sense. However, these are the initial, untrained 252 values (all b1, b2 entries are 0). Is there some way I can see the 252 values from res2.minimizer printed out the same way (4 distinct arrays) as when I used Flux.params(dudt2)? I am not sure if there is some way to use Flux.params(…) to do that. I would then check that the values are the same between res2.minimizer and Flux.params(…) (or some other method). Thanks for clarifying.

p, re = Flux.destructure(dudt2) returns the flat parameter vector p and the restructure function re; calling re(p) rebuilds the network from a parameter vector.
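As a rough sketch of how that can be used to inspect trained values: the snippet below rebuilds the network from a flat vector and compares the resulting arrays against manual slices. The vector θ is a hypothetical stand-in for res2.minimizer (random numbers here), and the W1, b1, W2, b2 ordering returned by Flux.params is an assumption that the final comparisons are meant to check. Newer Flux releases mark Flux.params as deprecated, so treat this as matching the Flux versions discussed in this thread.

```julia
using Flux

dudt2 = Chain(Dense(2, 50, tanh), Dense(50, 2))

# p is the flat vector of the initial parameters; re rebuilds a Chain from any such vector
p, re = Flux.destructure(dudt2)

# Hypothetical stand-in for res2.minimizer
θ = randn(Float32, length(p))
trained = re(θ)

# Assumed order of the parameter arrays: W1, b1, W2, b2
W1, b1, W2, b2 = Flux.params(trained)

# Compare against slicing the flat vector by hand (column-major reshape)
println(W1 == reshape(θ[1:100], 50, 2))
println(b1 == θ[101:150])
println(W2 == reshape(θ[151:250], 2, 50))
println(b2 == θ[251:252])
```

If all four comparisons print true, the flat vector is laid out as W1, b1, W2, b2 with each matrix stored column by column.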

I was able to do Flux.params(re(res2.minimizer)), and I see the parameters in four distinct W1, b1, W2, b2 matrices. Thanks, Chris. I appreciate it.