Hi Everyone,
I am new to Julia and this is my first post in the forum. I love the language, but I still struggle a little bit.
For example, I am using Flux to train a neural network on MNIST, and I now need to sample a percentage of my training set, which is already in the format Flux expects.
Here’s what I do:
# Get data from MNIST
train_x = dataset_x[:, :, 1:50_000]
train_y = dataset_y[1:50_000]
# Transform the data
train_x = reshape(train_x, (:, size(train_x,3))) |> gpu
train_y = onehotbatch(train_y ,0:9) |> gpu
# Create the training tuple that Flux accepts
trainingset = [(train_x, train_y)];
The above is a one-element vector holding a tuple of arrays, so it has size (1,). This is apparently the only format Flux accepts when I do:
Flux.train!(loss_function, params(model), trainingset, optimiser)
As mentioned, I need to sample from trainingset while keeping the X’s and y’s in the same order.
I am currently using
sampling = sample(trainingset,trunc(Int, round(size(trainingset[1][1],2)*j/100)),replace=false);
where j is a percentage that runs from 1 to 100 in a loop.
This has two issues:
- It complains about replace=false
- If I leave out replace, it returns nothing
Any tips?
I am open to:
a) Define a new way to build a training set (although zip(X,y) didn’t work)
b) Define a new way to sample from the dataset
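
For option b), this is roughly the behaviour I am after, as a minimal sketch on small CPU arrays rather than my real data. The helper name sample_columns is hypothetical (it is not a Flux function), and I am assuming that each column of X and the same column of Y belong to one sample:

```julia
using Random  # standard library, provides randperm

# Hypothetical helper (not part of Flux): keep `pct` percent of the columns
# of X together with the matching columns of Y, drawn without replacement.
function sample_columns(X::AbstractMatrix, Y::AbstractMatrix, pct::Real)
    n = size(X, 2)
    @assert size(Y, 2) == n "X and Y must have the same number of columns"
    k = round(Int, n * pct / 100)    # number of samples to keep
    idx = randperm(n)[1:k]           # k distinct column indices
    return X[:, idx], Y[:, idx]      # same indices, so the pairs stay aligned
end

# Toy stand-ins for the real data: column j is filled with the value j,
# which makes it easy to see that X and Y columns still match afterwards.
X = Float32[j for i in 1:4, j in 1:10]   # 4 features, 10 samples
Y = Float32[j for i in 1:3, j in 1:10]   # 3 outputs, 10 samples

xs, ys = sample_columns(X, Y, 30)        # keep 30% of the samples
```

The result could then be wrapped as [(xs, ys)] to get the same shape as trainingset above. I have not checked how this behaves with GPU arrays or with the output of onehotbatch.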
Thanks a lot!