NN: how to choose the layers/neurons?

I am implementing my own NN “library”, but I am stuck on the best way to then build the network itself, i.e. which type of layers to use, how many of them, and how many neurons per layer.

Let’s consider for example the following dataset:

xtrain = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1; 0.4 0.2; 0.3 0.3; 0.6 0.9; 0.3 0.4; 0.9 0.8]
ytrain = [(0.1*x[1] + 0.2*x[2] + 0.3) * rand(0.9:0.001:1.1) for x in eachrow(xtrain)]
xtest  = [0.5 0.6; 0.14 0.2; 0.3 0.7; 20.0 40.0]
ytest  = [(0.1*x[1] + 0.2*x[2] + 0.3) * rand(0.9:0.001:1.1) for x in eachrow(xtest)]

After a lot of trial and error I realised that something like this gives me good results, even for the last test point, which is intentionally outside the training range:

l1 = FullyConnectedLayer(linearf, 2, 3, w=ones(3,2), wb=zeros(3))
l2 = FullyConnectedLayer(linearf, 3, 1, w=ones(1,3), wb=zeros(1))
mynn = buildNetwork([l1,l2], squaredCost, name="Feed-forward Neural Network Model 1")
train!(mynn, xtrain, ytrain, maxepochs=10000, η=0.01, rshuffle=false, nMsgs=10)
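
For reference, I check the fit roughly like this (predict here is just a stand-in name for my library’s forward-pass function, and the output layer returns a 1-element vector):

ŷtest = [predict(mynn, x)[1] for x in eachrow(xtest)]   # scalar prediction per test row
hcat(ŷtest, ytest)                                       # compare side by side with the noisy targets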

But the last test point is only captured well because the relation between x and y is strictly linear.
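
Concretely, with linear activations the two layers collapse into a single affine map, $\hat y = W_2(W_1 x + b_1) + b_2 = W x + b$. So if training recovers weights close to the generating coefficients ($W \approx [0.1\ \ 0.2]$, $b \approx 0.3$), even the far-out point extrapolates correctly: $\hat y(20, 40) \approx 0.1 \cdot 20 + 0.2 \cdot 40 + 0.3 = 10.3$.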

When I break that linearity, things get harder: I am unable to get good results with the following dataset:

xtrain = [0.1 0.2; 0.3 0.5; 0.4 0.1; 0.5 0.4; 0.7 0.9; 0.2 0.1; 0.4 0.2; 0.3 0.3; 0.6 0.9; 0.3 0.4; 0.9 0.8]
ytrain = [(0.1*x[1]^2 + 0.2*x[2] + 0.3) * rand(0.95:0.001:1.05) for x in eachrow(xtrain)]
xtest  = [0.5 0.6; 0.14 0.2; 0.3 0.7; 20.0 40.0]
ytest  = [(0.1*x[1]^2 + 0.2*x[2] + 0.3) * rand(0.95:0.001:1.05) for x in eachrow(xtest)]

I did try other activation functions (ReLU, tanh, …) and added many more layers/neurons, but then the NN always returns the same ŷ whatever the input.
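
For example, one of the variants I tried looked roughly like this (reluf is just a placeholder name for the ReLU activation in my library; the constructors follow the same signature as above, with random weight initialisation when w/wb are not given):

l1 = FullyConnectedLayer(reluf, 2, 10)          # hidden layer with non-linear activation
l2 = FullyConnectedLayer(linearf, 10, 1)        # linear output layer for the regression
mynn2 = buildNetwork([l1,l2], squaredCost, name="Feed-forward Neural Network Model 2")
train!(mynn2, xtrain, ytrain, maxepochs=10000, η=0.01, rshuffle=false, nMsgs=10)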

How can I implement a NN that can “learn” a potential non-linear relationship, so that it also works on out-of-sample data, as in the second dataset above?

More broadly, I have learned how to implement the NN algorithm itself, which was the easy part, but what strategies should I now use for actually building the network? Any references on this topic?

Have you tried ResNet?

I didn’t know about them yet.
Do you think a ResNet would be able to give good results in a case like my second example?
Are they implemented in a ML library like Flux or Knet?
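
From a quick search, Flux seems to have a SkipConnection layer; is this roughly what a residual block would look like (untested sketch, just to check I understood the idea)?

using Flux
block = SkipConnection(Dense(10 => 10, relu), +)       # residual block: x -> layer(x) + x
model = Chain(Dense(2 => 10, relu), block, Dense(10 => 1))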