Dear all,
I want to use Flux.jl to build a simple Multi-Layer Perceptron (MLP), as I did in Keras. The input data is a matrix of size nGene (number of genes) by nInd (number of individuals), and the output is a vector of length nInd representing a trait (e.g. height). There are also two hidden layers with 64 and 32 neurons, respectively. In summary, the number of neurons changes as nGene → 64 → 32 → 1.
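For concreteness, the data layout looks like this (the sizes and random values below are only placeholders, not my real genotype data):

# Placeholder data with the same layout as my real data; sizes are examples only
nGene, nInd = 10_000, 500
X_train   = randn(Float32, nInd, nGene)   # individuals in rows, genes in columns (the layout Keras expects with input_dim = nGene)
X_train_t = permutedims(X_train)          # nGene by nInd, genes in rows (the layout Flux's Dense expects)
Y_train   = randn(Float32, nInd)          # one trait value (e.g. height) per individual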
In Keras, the MLP is:
# Imports (or tensorflow.keras, depending on the installation)
from keras.models import Sequential
from keras.layers import Dense, Activation

# Instantiate
model = Sequential()
# Add first layer
model.add(Dense(64, input_dim=nGene))
model.add(Activation('relu'))
# Add second layer
model.add(Dense(32))
model.add(Activation('softplus'))
# Last, output layer
model.add(Dense(1))
# Compile and train
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100)
With this Keras model, the loss (MSE) at each epoch is less than one, and the prediction accuracy on the test data is about 0.6, which is good.
In Flux.jl, I built the same MLP as follows:

using Flux

# Repeat the full-batch training data 100 times, i.e. 100 epochs
data = Iterators.repeated((X_train_t, Y_train), 100)

# Same architecture as in Keras: nGene → 64 → 32 → 1
model = Chain(
    Dense(nGene, 64, relu),
    Dense(64, 32, softplus),
    Dense(32, 1))

loss(x, y) = Flux.mse(model(x), y)
ps = Flux.params(model)
opt = ADAM()
evalcb = () -> @show(loss(X_train_t, Y_train))

Flux.train!(loss, ps, data, opt, cb = evalcb)
Here X_train_t is an nGene by nInd matrix, and Y_train is a vector of length nInd.
The loss is extremely high, and the prediction accuracy on the test data is almost zero.
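For reference, one way to measure prediction accuracy for a continuous trait is the Pearson correlation between predicted and observed values; here is a minimal sketch on the Flux side (the metric and the names X_test_t / Y_test are assumptions for illustration):

using Statistics   # for cor

# X_test_t is assumed to be nGene by nTest, Y_test a vector of length nTest
Y_pred   = vec(model(X_test_t))    # flatten the 1-by-nTest output to a plain vector
test_mse = Flux.mse(Y_pred, Y_test)
test_acc = cor(Y_pred, Y_test)     # Pearson correlation as the "prediction accuracy"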
By the way, in Flux.jl, if I change the optimiser to plain gradient descent, training does not even converge.
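By gradient descent I mean swapping the optimiser to something like the following (the learning rate is only a placeholder):

# Plain gradient descent in place of ADAM; the learning rate 0.01 is just an example
opt = Descent(0.01)
Flux.train!(loss, ps, data, opt, cb = evalcb)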
I really don’t know why the training in Flux.jl goes wrong. Could you please give me a hint about what is wrong with my code?
Thank you very much,
-Carol