How to train/predict a very simple feed-forward neural network in Flux?

sylvaticus · May 15, 2020, 4:57pm

I must be an idiot, but I can’t figure out how to do something very basic… no convolution, RNN, gates… just a plain feed-forward neural network design/testing/validation in Flux.

Something like:

l1 = FullyConnectedLayer(tanh,2,3,w=ones(3,2), wb=zeros(3))
l2 = FullyConnectedLayer(tanh,3,2, w=ones(2,3), wb=zeros(2))
l3 = FullyConnectedLayer(linearf,2,1, w=ones(1,2), wb=zeros(1))
mynn = buildNetwork([l1,l2,l3],squaredCost,name="Feed-forward Neural Network Model 1")

xtrain = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1]
ytrain = [0.3; 0.8; 0.5; 0.9; 1.6; 0.3]
xtest  = [0.5 0.6; 0.14 0.2; 0.3 0.7; 2.0 4.0]
ytest  = [1.1; 0.36; 1.0; 6.0]

train!(mynn,xtrain,ytrain,maxepochs=10000,η=0.01,rshuffle=false,nMsgs=10)
errors(mynn,xtest,ytest) # 0.000196
for (i,r) in enumerate(eachrow(xtest))
  println("x: $r ŷ: $(predict(mynn,r)[1]) y: $(ytest[i])")
end

Iulian.Cioarca · May 15, 2020, 5:08pm

Did you check the Flux model_zoo for examples? I see they updated it quite recently. They also have the 60min blitz which could be useful.

Albert_Zevelev · May 16, 2020, 3:19am

@sylvaticus maybe my code will help:
Generic Function to train NN w/ Flux

sylvaticus · May 17, 2020, 11:50am

Thank you, that’s the clearest example I could find. Still, I can’t get it working with toy data:

using Flux
xtrain     = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1]
ytrain     = [0.3; 0.8; 0.5; 0.9; 1.6; 0.3]
xtest      = [0.5 0.6; 0.14 0.2; 0.3 0.7]
ytest      = [1.1; 0.36; 1.0]
# Direct way. Error: "Output should be scalar":
model      = Chain(Dense(2, 1))
loss(x, y) = Flux.mse,(model(x), y)
Flux.@epochs 200 Flux.train!(loss, model, Flux.Data.DataLoader(xtrain', ytrain'), ADAGrad())

# Using Albert_Zevelev's "f" function. Error: DimensionMismatch:
d = Flux.Data.DataLoader(xtrain', ytrain');
function f(d, XT, YT, XH, YH;
           m = Chain(Dense(size(XT,2), 1)), #Model/Activation
           ℓ = Flux.mse,                    #Loss: mse, crossentropy...
           #                                #Penalty: add later...
           opt = ADAGrad(),                 #Optimiser
           nE = 200                         #Number epochs
          )
  loss(x, y) = ℓ(m(x), y)
  Flux.@epochs nE Flux.train!(loss, params(m), d, opt)
  IS = Flux.mse(m(XT'), YT') |> sqrt
  OS = Flux.mse(m(XH'), YH') |> sqrt
  return IS, OS
end
f(d, xtrain, ytrain, xtest, ytest, m = Chain(Dense(12,1)), nE=    200)

I’m sorry, but I really believe there is a problem with the documentation (obviously this is just my personal opinion). I have looked at the model_zoo several times: it is very useful if you already know a bit and need a starting point for your own problem. But the examples are all implementations for specific areas; there isn’t a “model zero” tutorial.
I believe there should be a very, very, very trivial example like the one I am trying to solve. No need to load the Boston housing or MNIST datasets; at this stage that is a distraction.
No convolutional layers, recurrent neural networks, models with gates… just a plain model to show how to build it, how to get predictions, how to train it, and how to check its performance.
After several hours I still can’t do it in Flux, and I feel very frustrated :-/
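
A note for readers hitting the same two errors; this is a likely reading of the snippets above, not a confirmed diagnosis. In the direct version, the comma in `Flux.mse,(model(x), y)` makes `loss` return a tuple rather than a number, which would explain "Output should be scalar". In the second version, `Dense(12,1)` expects 12 input features while the data has 2, which would explain the DimensionMismatch. The package-free sketch below reproduces both effects; `mse` is a hypothetical stand-in for `Flux.mse`, and the plain matrices `W2` and `W12` are hypothetical stand-ins for a Dense layer's weights.

```julia
# Hypothetical stand-in for Flux.mse, so the sketch needs no packages.
mse(ŷ, y) = sum(abs2, ŷ .- y) / length(y)

# 1. A stray comma after the function name builds a tuple instead of calling it.
loss_bad(ŷ, y)  = mse,(ŷ, y)   # returns the tuple (mse, (ŷ, y))
loss_good(ŷ, y) = mse(ŷ, y)    # returns a number

println(loss_bad([1.0], [2.0]) isa Tuple)
println(loss_good([1.0], [2.0]) isa Real)

# 2. A layer's weight matrix is (outputs × inputs); it multiplies data
#    laid out as (features × samples).
xtrain = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1]
X = xtrain'            # 2 features × 6 samples

W2  = ones(1, 2)       # sized like Dense(2, 1)
W12 = ones(1, 12)      # sized like Dense(12, 1)

println(size(W2 * X))  # one output row, one column per sample

# The 12-input weights cannot multiply 2-feature data.
err = try
    W12 * X
    nothing
catch e
    e
end
println(err isa DimensionMismatch)
```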

lhnguyen-vn · May 17, 2020, 1:07pm

I’m sorry that your experience has been less than ideal, but I found Flux’s documentation a really good starting point. The training section, for example, details how to set up a model.

Here’s how your toy example could be implemented:

using Flux

xtrain     = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1]
ytrain     = [0.3; 0.8; 0.5; 0.9; 1.6; 0.3]
xtest      = [0.5 0.6; 0.14 0.2; 0.3 0.7]
ytest      = [1.1; 0.36; 1.0]

model = Dense(2, 1) # Use Chain if you want to stack layers

loss(x, y) = Flux.mse(model(x), y)
ps = params(model)
dataset = [(xtrain', ytrain')] # Use DataLoader for easy minibatching
opt = ADAGrad()

Flux.@epochs 100 Flux.train!(loss, ps, dataset, opt)
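
As written, `dataset` holds a single `(x, y)` tuple, so each epoch is one full-batch update. The package-free sketch below illustrates what minibatching along the sample (column) dimension amounts to. It is not DataLoader itself; the batch size of 2 and the name `batches` are arbitrary choices for illustration.

```julia
xtrain = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1]
ytrain = [0.3; 0.8; 0.5; 0.9; 1.6; 0.3]

X = xtrain'   # 2 features × 6 samples
Y = ytrain'   # 1 output   × 6 samples

batchsize = 2  # arbitrary, for illustration

# Split the sample indices into chunks and slice the same columns of X and Y.
batches = [(X[:, idx], Y[:, idx])
           for idx in Iterators.partition(1:size(X, 2), batchsize)]

for (xb, yb) in batches
    println(size(xb), " ", size(yb))
end
```

A vector of tuples like `batches` has the same shape as `dataset` above, so it could presumably be passed to `Flux.train!` in its place, giving one update per batch instead of one per epoch.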

The train loop also takes an additional keyword argument cb for callbacks. For instance, if you want to see how the loss improves each epoch:

cb = () -> println(loss(xtrain', ytrain'))
Flux.@epochs 100 Flux.train!(loss, ps, dataset, opt, cb = cb)
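
The original question also asked for predictions and a test error, which the snippets above do not show. A minimal sketch follows, assuming the layer can be called on a features × samples matrix exactly as in the training code. So that the block runs on its own, hand-set weights passed to the `Dense(W, b)` constructor stand in for a trained model; they are an assumption for illustration, chosen because the toy targets are close to x1 + x2.

```julia
using Flux

xtest = [0.5 0.6; 0.14 0.2; 0.3 0.7]
ytest = [1.1; 0.36; 1.0]

# Stand-in for a trained model (assumption): W = [1 1], b = 0 computes x1 + x2.
# After real training, use the trained `model` from above instead.
model = Dense(ones(1, 2), zeros(1))

# Predict on all test samples at once: input is 2 × 3, output is 1 × 3.
ŷ = model(xtest')

# Test error, with the targets transposed to the same 1 × 3 layout.
testmse = Flux.mse(ŷ, ytest')
println("test MSE: ", testmse)

# Per-sample predictions, mirroring the loop in the original question.
for (i, r) in enumerate(eachrow(xtest))
    println("x: $r ŷ: $(model(r)[1]) y: $(ytest[i])")
end
```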

shawngiese · May 26, 2021, 8:33pm

That is a great example and ran smoothly. I tried using DataLoader for batching but then I got errors about the dimensions.

data_batch = Flux.Data.DataLoader((xtrain', ytrain'), batchsize=6)
Flux.@epochs 100 Flux.train!(loss, ps, data_batch, opt) 
┌ Info: Epoch 1 
└ @ Main C:\Users\shawn\.julia\packages\Flux\6o4DQ\src\optimise\train.jl:135

DimensionMismatch("A has dimensions (13,13) but B has dimensions (2,6)")

shawngiese · May 26, 2021, 8:42pm

I think the DataLoader call was supposed to be the following:

data_batch = Flux.Data.DataLoader((xtrain', ytrain), batchsize=6)
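
A caution on the un-transposed `ytrain`, offered as something to check rather than a confirmed problem: `ytrain` is a length-6 Vector, while a `Dense(2, 1)` layer applied to a 2×6 batch returns a 1×6 matrix. Subtracting the two broadcasts to 6×6, so a hand-written squared-error loss would average over 36 pairings instead of 6 residuals. How `Flux.mse` treats mismatched shapes may depend on the Flux version. The sketch uses a placeholder array `ŷ` in place of real model output.

```julia
ytrain = [0.3; 0.8; 0.5; 0.9; 1.6; 0.3]   # length-6 Vector

# Placeholder for model(x) on a 6-sample batch: 1 output × 6 samples.
ŷ = ones(1, 6)

# Row matrix minus column vector broadcasts to every pairing.
println(size(ŷ .- ytrain))

# Transposing the targets gives one residual per sample.
println(size(ŷ .- ytrain'))
```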
