Hi,
Does Flux.train! have support for mini-batches out of the box? If it does I couldn’t see it.
Does this mean I need to restructure my data and rewrite the loss function to act on a whole mini-batch at once?
Thanks
Check out MLDataUtils.jl; it should have the support you need and works well with Flux.
Mini-batching is supported, more or less automatically. One just has to structure the data provided to the train!
method in the appropriate way. Here is my understanding:
In ordinary SGD one calls train! with a data argument of the form

data = [(x1, y1), (x2, y2), ..., (xN, yN)]

For k mini-batches of size 3 (say), use instead

data = [(X1, Y1), (X2, Y2), ..., (Xk, Yk)]

where

X1 = cat(x1, x2, x3, dims=px), X2 = cat(x4, x5, x6, dims=px), etc.
Y1 = cat(y1, y2, y3, dims=py), Y2 = cat(y4, y5, y6, dims=py), etc.
Here px is one more than the dimension of the inputs (e.g., 3 for grey-scale images) and py is one more than the dimension of the outputs. For an example, see this example from the Flux model-zoo.
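To make this concrete, here is a minimal sketch of building batched data by hand and passing it to train!. It assumes vector inputs and outputs (so the concatenation dimension is 2), random toy data, and the older implicit-parameters train! signature that was current for this thread; newer Flux versions use a different API.

```julia
using Flux

model = Dense(2, 3)                       # 2 inputs, 3 outputs (toy model)
loss(x, y) = Flux.mse(model(x), y)        # loss works on single samples or batches

# 600 toy samples: each x is a 2-vector, each y a 3-vector.
xs = [rand(Float32, 2) for _ in 1:600]
ys = [rand(Float32, 3) for _ in 1:600]

# Concatenate every 3 samples along the last dimension (dims = 2 for
# vectors), giving 2x3 input and 3x3 output matrices per mini-batch.
batchsize = 3
Xs = [reduce(hcat, xs[i:i+batchsize-1]) for i in 1:batchsize:length(xs)]
Ys = [reduce(hcat, ys[i:i+batchsize-1]) for i in 1:batchsize:length(ys)]
data = collect(zip(Xs, Ys))               # [(X1, Y1), (X2, Y2), ...]

opt = Descent(0.01 / batchsize)           # scale the rate for summed gradients
Flux.train!(loss, Flux.params(model), data, opt)
```

The key point is that each (Xi, Yi) pair is just the per-sample arrays stacked along one extra trailing dimension; train! itself needs no changes.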
This “just works” because the provided loss functions and model layers (e.g., model = Dense(2, 3)) can be applied to multiple data instances at once, simply by concatenating the data along the last dimension.
(In this version of mini-batching the gradients are summed rather than averaged, so you may want to divide the SGD learning rate by the batch size.)
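An alternative to shrinking the learning rate is to average inside the loss itself, so gradient magnitudes stay comparable across batch sizes. This is a hedged sketch under the same assumptions as above (a toy Dense model and batches stacked along the last dimension); it is one way to do it, not the only one.

```julia
using Flux

model = Dense(2, 3)
batchsize = 3

# Sum of squared errors over the whole batch, divided by the batch size,
# so each mini-batch contributes a mean rather than a sum.
loss(X, Y) = sum(abs2, model(X) .- Y) / batchsize
```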