Flux - support for mini-batches


Does Flux.train! have support for mini-batches out of the box? If it does I couldn’t see it.

Does it mean I need to restructure my data and rewrite the loss function to act on the whole mini-batch at once?


Check out MLDataUtils.jl; it should have the support you need and works well with Flux.


Mini-batching is supported, more or less automatically. One just has to structure the data passed to train! in the appropriate way. Here is my understanding:

In ordinary SGD one calls train! with a data argument of the form

data = [(x1, y1), (x2, y2), ... , (xN, yN)]

For k mini-batches of size 3 (say) use instead

data = [(X1, Y1), (X2, Y2), ..., (Xk, Yk)]

where

X1 = cat(x1, x2, x3, dims=px), X2 = cat(x4, x5, x6, dims=px), etc.
Y1 = cat(y1, y2, y3, dims=py), Y2 = cat(y4, y5, y6, dims=py), etc.

Here px is one more than the number of dimensions of a single input (e.g., 3 for grey-scale images, which are 2-dimensional arrays) and py is one more than the number of dimensions of a single output. For a worked case, see this example from the Flux model-zoo.
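To make the batching recipe concrete, here is a minimal sketch in plain Julia (no Flux needed). The toy samples `xs` and `ys` and the variable `batchsize` are made up for illustration; only `cat` and `Iterators.partition` from Base are used.

```julia
# Hypothetical toy data: 6 samples, each input a 2-element vector,
# each target a 1-element vector.
xs = [Float32[i, i + 0.5] for i in 1:6]
ys = [Float32[2i] for i in 1:6]

batchsize = 3
px = ndims(xs[1]) + 1   # one more than the dimensions of a single input (here 2)
py = ndims(ys[1]) + 1   # one more than the dimensions of a single output (here 2)

# Split the sample indices into chunks of `batchsize` and concatenate
# each chunk along the new trailing dimension.
data = [(cat(xs[idx]...; dims=px), cat(ys[idx]...; dims=py))
        for idx in Iterators.partition(1:length(xs), batchsize)]

println(length(data))        # number of mini-batches
println(size(data[1][1]))    # size of X1
println(size(data[1][2]))    # size of Y1
```

With 6 samples and a batch size of 3 this gives two tuples, each holding a 2×3 input matrix and a 1×3 target matrix. If the number of samples is not a multiple of `batchsize`, the last batch is simply smaller.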

This “just works” because methods like the provided loss functions and the model functions (e.g., model = Dense(2,3)) can be called on multiple instances of data arguments by just concatenating the data along the last dimension.
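A small sketch of why concatenation is enough. The `model` below is a hand-written stand-in for a dense layer with 2 inputs and 3 outputs, not Flux's own Dense, and the weights are arbitrary values picked for illustration. Since the layer is just a matrix product plus a broadcast bias, it treats each column of a matrix as a separate sample.

```julia
# Hand-written stand-in for a dense layer (2 inputs, 3 outputs).
# W and b are arbitrary illustrative values, not trained weights.
W = Float32[1 0; 0 1; 1 1]
b = Float32[0, 0, 1]
model(x) = W * x .+ b

x1 = Float32[1, 2]
x2 = Float32[3, 4]
X = cat(x1, x2; dims=2)   # 2×2 matrix, one sample per column

out = model(X)            # 3×2 matrix, one output per column

# Each column of the batched output equals the single-sample output.
println(out[:, 1] == model(x1))
println(out[:, 2] == model(x2))
```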

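(If the loss sums over the batch rather than averaging, the gradients are summed as well, so you may want to divide the SGD learning rate by the batch size. Note that recent Flux versions' built-in losses such as Flux.mse average by default.)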
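The learning-rate remark can be checked by hand on a toy problem. The sketch below assumes a scalar linear model ŷ = w * x with squared error and a hand-derived gradient; all names and numbers are illustrative, and no Flux or automatic differentiation is involved.

```julia
# Scalar model ŷ = w * x with per-sample loss (w*x - y)^2.
w  = 0.5
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Hand-derived d/dw of (w*x - y)^2.
grad(w, x, y) = 2 * (w * x - y) * x

g_sum  = sum(grad.(w, xs, ys))   # gradient of the summed loss
g_mean = g_sum / length(xs)      # gradient of the averaged loss

η = 0.1
# A step on the summed loss with rate η / batchsize ...
w_sum  = w - (η / length(xs)) * g_sum
# ... lands on the same point as a step on the averaged loss with rate η.
w_mean = w - η * g_mean

println(w_sum ≈ w_mean)
```

So a summed loss with the learning rate divided by the batch size behaves like an averaged loss with the original learning rate, at least for plain gradient descent.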