Sampling from a tuple of arrays

Hi Everyone,
I am new to Julia and this is my first post in the forum. I love the language, but I still struggle a little bit.

For example, I am using Flux to train a neural network on MNIST, and I now need to sample a percentage of my Flux-formatted dataset.

Here’s what I do:

# Get data from MNIST 
train_x = dataset_x[:, :, 1:50_000]
train_y = dataset_y[1:50_000]

# Transform the data
train_x = reshape(train_x, (:, size(train_x, 3))) |> gpu
train_y = onehotbatch(train_y, 0:9) |> gpu

# Create the training tuple that Flux accepts
trainingset = [(train_x, train_y)];

The above is a one-element vector containing a tuple of arrays, so it has size (1,). This is apparently the only format Flux accepts when I do:

Flux.train!(loss_function, params(model), trainingset, optimiser)
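To make the shape of trainingset concrete, here is a minimal sketch using small CPU arrays in place of the real MNIST data. The names toy_x, toy_y and toy_set and their sizes are made up for illustration. It only shows that the outer container has a single element, which may be why sampling from it directly does not pick out individual observations.

```julia
# Toy stand-ins for the real data (hypothetical sizes, CPU only)
toy_x = rand(Float32, 4, 10)      # 4 features, 10 observations (one per column)
toy_y = rand(Float32, 2, 10)      # 2 classes, same 10 observations

toy_set = [(toy_x, toy_y)]        # same construction as trainingset above

println(typeof(toy_set) <: AbstractVector)  # true: the outer container is a Vector
println(size(toy_set))                      # (1,): it holds a single element
println(toy_set[1] isa Tuple)               # true: that element is the (x, y) tuple
println(size(toy_set[1][1], 2))             # 10: the number of observations
```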

As mentioned, I need to sample from trainingset while keeping the X’s and y’s in the same order.

I am currently using

sampling = sample(trainingset,trunc(Int, round(size(trainingset[1][1],2)*j/100)),replace=false);

where j is a loop variable running from 1 to 100.

This has two issues:

  1. It complains about replace=false
  2. If I omit replace, it returns nothing

Any tips?

I am open to:

a) Define a new way to build a training set (although zip(X,y) didn’t work)
b) Define a new way to sample from the dataset
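One possible direction for (b), as a rough sketch only: draw a random set of column indices once and use the same indices on both arrays, so each X column stays paired with its y column. The helper sample_columns is a made-up name, not part of Flux or StatsBase. The sketch assumes both arrays store one observation per column, and it uses small CPU matrices; whether the same indexing behaves identically on GPU arrays or on a one-hot matrix is an assumption that has not been checked here.

```julia
using Random: randperm

# Hypothetical helper (not a Flux or StatsBase function): keep `pct` percent
# of the columns, chosen without replacement, using the SAME indices for x and y.
function sample_columns(x::AbstractMatrix, y::AbstractMatrix, pct::Real)
    n = size(x, 2)
    @assert size(y, 2) == n "x and y must have the same number of columns"
    k = round(Int, n * pct / 100)   # number of observations to keep
    idx = randperm(n)[1:k]          # k distinct column indices in random order
    return x[:, idx], y[:, idx]
end

# Toy data standing in for train_x and train_y (sizes are illustrative)
toy_x = rand(Float32, 4, 10)
toy_y = rand(Float32, 2, 10)

sub_x, sub_y = sample_columns(toy_x, toy_y, 30)   # keep 30% of the columns
sub_set = [(sub_x, sub_y)]                        # same container shape as trainingset
```

In the loop over j, the call would become sample_columns(train_x, train_y, j), with the result wrapped in a one-element vector as before.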

Thanks a lot!