Problem with first example with Flux

Hello,

I am trying the very first example in Flux, from Overview · Flux. The code is given below. It works fine with hcat(0:5...) but fails with hcat(0:6...). Why?

using Flux
using Flux: train!
import Random
Random.seed!(1)

actual(x) = 4x + 2                          # ground-truth function to learn
loss(x, y) = Flux.Losses.mse(predict(x), y) # mean squared error of the model
x_train = hcat(0:5...)                      # works; hcat(0:6...) produces NaN
x_test = hcat(7:14...)
y_train = actual.(x_train)
y_test = actual.(x_test)
print(y_train, "\n", y_test, "\n")
predict = Dense(1 => 1)                     # one-input, one-output linear layer
opt = Descent()                             # plain gradient descent, default step 0.1
data = [(x_train, y_train)]
parameters = Flux.params(predict)
predict.weight in parameters, predict.bias in parameters  # confirm both are tracked
for epoch in 1:200
    train!(loss, parameters, data, opt)
end
print(loss(x_train, y_train), "\n")
parameters

Output with hcat(0:5...)

[2 6 10 14 18 22]
[30 34 38 42 46 50 54 58]
0.009775136
Params([Float32[3.9697118;;], Float32[1.9914621]])

Output with hcat(0:6...)

[2 6 10 14 18 22 26]
[30 34 38 42 46 50 54 58]
NaN
Params([Float32[NaN;;], Float32[NaN]])

Try adding the line @show parameters in your training loop and you'll see how the parameters oscillate with larger and larger steps. One way to get around this is to take smaller steps in your optimizer; for Descent the default step size is 0.1, so I would suggest trying something smaller than that.
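For example, something like this (a minimal sketch reusing the definitions from your post; the only substantive changes are the 0:6 range, the smaller step size passed to Descent, and the @show):

using Flux
using Flux: train!

actual(x) = 4x + 2
predict = Dense(1 => 1)
loss(x, y) = Flux.Losses.mse(predict(x), y)
x_train = hcat(0:6...)                   # the range that diverged before
y_train = actual.(x_train)
data = [(x_train, y_train)]
parameters = Flux.params(predict)
opt = Descent(0.01)                      # smaller step size than the default 0.1
for epoch in 1:200
    train!(loss, parameters, data, opt)
    epoch % 50 == 0 && @show parameters  # updates now shrink instead of blowing up
end

With the smaller step you may need more epochs to reach the same loss, but the updates stay stable.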

The reason it happens in this specific case is that the magnitude of the error is

|wx + b - 4x - 2| = |(w - 4)x + b - 2|

which, as long as w ≠ 4, grows with larger x, and the gradients grow with it. So for this model it is expected that if you train on a range x = 0:n, there is some n beyond which the gradient steps become large enough to cause unstable updates that just grow to infinity. Lowering the optimizer step size allows for a larger n, though you will still hit a new, higher limit at some point.
