SGD in Flux.jl

wujinq · June 18, 2021, 8:21am

I’m new to Flux.jl, and I’m kind of confused by the fact that there is no stochastic gradient descent algorithm. I checked the source code, and it seemed to me that Descent is just GD instead of SGD.

Of course we can just introduce some random factors into the training data. Anyway, what is the best practice to perform SGD in Flux.jl?

xiaodai · June 18, 2021, 10:30am

I’m not sure I follow what you’re asking.

The S in SGD doesn’t come from choosing a random direction. Each step still moves in the direction that reduces the loss. The stochasticity comes from passing in partial batches of data, e.g. if your data has 1_000_000 records, you only pass in 32 records at a time. These batches of 32 are randomly re-assigned each epoch, so your GD is stochastic through the randomness in the batch and the randomness of where you end up on the loss surface. In other words, Descent applied to randomly drawn mini-batches is SGD.
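To make the batching idea concrete, here is a minimal sketch in plain Julia (standard library only, no Flux calls). The toy linear-regression data and the names `loss`, `grad`, and `sgd` are made up for illustration. The update line is the same plain gradient-descent step that `Descent` performs, and the only source of randomness is the reshuffling of batches each epoch.

```julia
using Random, Statistics

# Hypothetical toy problem: recover w_true from noisy linear data.
rng = MersenneTwister(0)
n, d = 1_000, 3
X = randn(rng, d, n)                      # one column per record
w_true = [1.0, -2.0, 0.5]
y = X' * w_true .+ 0.01 .* randn(rng, n)

# Mean squared error on a batch, and its gradient with respect to w.
loss(w, Xb, yb) = mean(abs2, Xb' * w .- yb)
grad(w, Xb, yb) = 2 .* (Xb * (Xb' * w .- yb)) ./ length(yb)

function sgd(X, y, rng; batchsize = 32, epochs = 20, lr = 0.05)
    w = zeros(size(X, 1))
    nrecords = size(X, 2)
    for _ in 1:epochs
        idx = shuffle(rng, 1:nrecords)    # re-assign records to batches each epoch
        for batch in Iterators.partition(idx, batchsize)
            Xb, yb = X[:, batch], y[batch]
            w .-= lr .* grad(w, Xb, yb)   # plain gradient-descent step on the batch
        end
    end
    return w
end

w = sgd(X, y, rng)
println("full-data loss after training: ", loss(w, X, y))
```

In Flux itself you would not write the loop by hand. As far as I know, the usual pattern is to wrap the data in a `DataLoader` with a `batchsize` and `shuffle=true` and train with `Descent` on those batches, but the exact module path and training call depend on your Flux version, so check the docs for the release you have installed.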

Hope that makes sense.
