The biggest thing that jumps out to me is that the SGD loop is inside out. Currently you have:
for batch in randsamples(data):
    for i in range(nepoch):
        sgd_step(model, batch, opt)
Whereas the correct ordering should be:
for i in range(nepoch):
    for batch in randsamples(data):
        sgd_step(model, batch, opt)
Conceptually, you can think of the first approach as beating the model over the head with a single batch, then swapping to a completely different one and repeating. Some of that weight adaptation may happen to be conducive to generalization, but most likely you will have overfit on the last minibatch, since the model is fed that one batch repeatedly for 250 consecutive optimization steps right before evaluation. Instead of the semi-directed random walk on the loss landscape you would expect from SGD, the trajectory will look like a series of dramatic jerks without any real sense of direction.
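For concreteness, here is a minimal runnable sketch of the corrected nesting. It assumes a PyTorch-style setup; the toy data, Linear model, and MSE loss are placeholders standing in for your own model, data, and whatever sgd_step does internally. The only point being illustrated is that the epoch loop sits on the outside and the shuffled-batch loop on the inside.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy stand-ins so the loop runs end to end; swap in your own model/data.
    X, y = torch.randn(1000, 16), torch.randn(1000, 1)
    loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # reshuffled every epoch
    model = torch.nn.Linear(16, 1)
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = torch.nn.MSELoss()

    nepoch = 250
    for epoch in range(nepoch):    # outer loop: one full pass over the data per epoch
        for xb, yb in loader:      # inner loop: one optimization step per minibatch
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()

With this ordering every parameter update sees a different minibatch, so the gradient noise averages out over an epoch instead of compounding on a single batch right before evaluation.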