I am trying to find an easy-to-use package or example for training a single-layer feed-forward neural network. In R, the neuralnet and nnet packages are easy to use for a layman like me, but I couldn't find an equivalent package in Julia. My model is very simple, so I do not need a full-featured package like Flux. I just need to fit a neural network to predict 4 dependent variables from 80 independent variables. Any suggestions would be appreciated.
Flux is perfect for this. I use it for two-layer multilayer perceptrons all the time.
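For reference, here is a minimal sketch of what an 80-input, 4-output network could look like in Flux. It assumes a recent Flux release (one that provides `Flux.setup` and the `Dense(in => out)` syntax); the synthetic data, the hidden layer size of 16, the learning rate and the number of epochs are arbitrary choices for illustration, not something from the original post.

```julia
using Flux

# Synthetic data (assumption for illustration). Flux expects features × samples.
X = rand(Float32, 80, 1000)
Y = vcat(sum(X[1:10, :], dims=1), sum(X[21:50, :], dims=1),
         sum(X[51:70, :], dims=1), sum(X[71:80, :], dims=1))   # 4 × 1000 targets

# One hidden layer with 16 units (arbitrary size), linear output layer
model = Chain(Dense(80 => 16, relu), Dense(16 => 4))

# Mean squared error between predictions and targets
loss(m, x, y) = Flux.mse(m(x), y)

opt_state = Flux.setup(Adam(0.01), model)
data = Flux.DataLoader((X, Y), batchsize=32, shuffle=true)

initial_loss = loss(model, X, Y)
for epoch in 1:20
    Flux.train!(loss, model, data, opt_state)   # one pass over all mini-batches
end
final_loss = loss(model, X, Y)

println("MSE before: ", initial_loss, "  after: ", final_loss)
```

Predictions for new data are then just `model(xnew)`, with `xnew` laid out as 80 × samples.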
If you do not care too much about performance, you can just build it by hand and use e.g. ForwardDiff.jl for the derivatives.
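A rough sketch of the by-hand approach, assuming a single dense layer with identity activation trained by plain full-batch gradient descent. The data, the helper names (`unpack`, `model`), the learning rate and the epoch count are all made up for illustration; only `ForwardDiff.gradient` comes from the package.

```julia
using ForwardDiff, Random

Random.seed!(1)

# Synthetic data (assumption): 100 samples, 80 inputs, 4 outputs
X = rand(100, 80)
Y = hcat(sum(X[:, 1:10], dims=2), sum(X[:, 21:50], dims=2),
         sum(X[:, 51:70], dims=2), sum(X[:, 71:80], dims=2))

nin, nout = 80, 4

# All parameters live in one flat vector so ForwardDiff can differentiate it;
# unpack splits it into the weight matrix W (80×4) and the bias b (4)
unpack(θ) = (reshape(θ[1:nin*nout], nin, nout), θ[nin*nout+1:end])

# Single dense layer with identity activation
function model(θ, X)
    W, b = unpack(θ)
    return X * W .+ transpose(b)
end

# Squared error summed over the 4 outputs, averaged over samples
loss(θ) = sum(abs2, model(θ, X) .- Y) / size(X, 1)

θ = 0.01 .* randn(nin * nout + nout)
initial_loss = loss(θ)

η = 0.01                                  # learning rate (arbitrary, small enough to be stable here)
for epoch in 1:100
    g = ForwardDiff.gradient(loss, θ)     # gradient of the loss w.r.t. all parameters
    θ .-= η .* g                          # gradient descent step
end
final_loss = loss(θ)

println("loss before: ", initial_loss, "  after: ", final_loss)
```

Forward-mode differentiation scales with the number of parameters, so this is only reasonable for small networks like this one (324 parameters). Adding a hidden layer would just mean extending `unpack` and `model`.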
I think this is exactly the use case for which I created BetaML… Yes, it has nowhere near the performance or features of Flux or Knet, but it is very simple to use, and its source code is, I believe, easy to read even for beginners in Julia.
This is the simplest possible way to define and train a neural network in BetaML:
using BetaML.Nn, StatsPlots
# Set-up random data...
xtrain = rand(1000,80)
ytrain = vcat([[sum(x[1:10]) sum(x[21:50]) sum(x[51:70]) sum(x[71:80])] for x in eachrow(xtrain)]...)
xtest  = rand(100,80)
ytest  = vcat([[sum(x[1:10]) sum(x[21:50]) sum(x[51:70]) sum(x[71:80])] for x in eachrow(xtest)]...)
# Define the network...
l1     = DenseLayer(80,4)               # Defaults to identity as activation function and Xavier weight initialisation
mynn   = buildNetwork([l1],squaredCost) # Build the NN using the squared cost (aka MSE) as error function
# Train the network...
res    = train!(mynn,xtrain,ytrain)     # Use optAlg=SGD() to use Stochastic Gradient Descent instead
# Get predictions...
ŷtrain = predict(mynn,xtrain)           # Predictions on the training set
ŷtest  = predict(mynn,xtest)            # Predictions on the test set
# Check goodness of fit...
MeanRelativeError = meanRelError(ŷtest,ytest)
RelativeMeanError = meanRelError(ŷtest,ytest,normDim=false,normRec=false)
scatter(ytrain,ŷtrain, title="training set", xlabel="ytrain",ylabel="ŷtrain")
scatter(ytest,ŷtest, title="testing set", xlabel="ytest",ylabel="ŷtest")
Training..   avg ϵ on (Epoch 1 Batch 31):     269.39832067713127
Training the Neural Network...   1%  ETA: 0:30:34
Training..   avg ϵ on (Epoch 10 Batch 31):    36.090238898665426
Training..   avg ϵ on (Epoch 20 Batch 31):    4.746713028974144
Training the Neural Network...  24%  ETA: 0:01:02
Training..   avg ϵ on (Epoch 30 Batch 31):    3.178440735560084
Training..   avg ϵ on (Epoch 40 Batch 31):    2.1860659815722703
Training the Neural Network...  47%  ETA: 0:00:23
Training..   avg ϵ on (Epoch 50 Batch 31):    2.889643634191205
Training..   avg ϵ on (Epoch 60 Batch 31):    2.0260411407898813
Training..   avg ϵ on (Epoch 70 Batch 31):    2.047279748794773
Training the Neural Network...  71%  ETA: 0:00:09
Training..   avg ϵ on (Epoch 80 Batch 31):    1.7450182174502227
Training..   avg ϵ on (Epoch 90 Batch 31):    1.6243245502906725
Training the Neural Network...  94%  ETA: 0:00:01
Training..   avg ϵ on (Epoch 100 Batch 31):   1.4598596834192916
Training the Neural Network... 100%  Time: 0:00:22
Training of 100 epoch completed. Final epoch error: 1.5223635773193913.
Of course, you can generally get much better results by scaling the variables, adding further layer(s), and/or tuning their activation functions or the optimisation algorithm (have a look at the notebooks or the documentation for that). The idea is that, while we offer a fair level of flexibility (you can choose or define your own activation function, easily define your own layers, choose the weight initialisation, choose or implement the optimisation algorithm and its parameters, choose the training parameters such as epochs, batch size, …, set the callback function that reports information during (long) training, …), we still try to keep things one step at a time. So for most things we provide default parameters that can be overridden when needed, rather than assuming that the user already knows and provides all the needed parameters.
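To illustrate the "scaling the variables" point, here is a small package-independent sketch of column-wise standardisation (zero mean, unit standard deviation). This is plain Julia with the `Statistics` standard library, not BetaML's own API; the data and the helper name `standardise` are assumptions for illustration. The key detail is that the scaling factors are computed on the training set only and then reused for the test set.

```julia
using Statistics

# Synthetic inputs on an arbitrary scale (assumption for illustration)
xtrain = rand(1000, 80) .* 100
xtest  = rand(100, 80) .* 100

# Per-column mean and standard deviation, estimated on the training set only
μ = mean(xtrain, dims=1)
σ = std(xtrain, dims=1)

# Apply the same transformation to any data set with 80 columns
standardise(x) = (x .- μ) ./ σ

xtrain_s = standardise(xtrain)
xtest_s  = standardise(xtest)   # reuses the training-set factors
```

The network would then be trained on `xtrain_s` and evaluated on `xtest_s`; if the targets are scaled too, predictions need to be transformed back before computing errors in the original units.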
StructuredOptimization can also handle simple neural nets, although it doesn't come with stochastic optimisation techniques (but I believe you can build those on top). Also, it currently doesn't support GPUs.
A simple demo is here.

