I want to use Flux.jl to build a simple Multi-Layer Perceptron (MLP), as I did in Keras. The input data is a matrix of nGene (number of genes) by nInd (number of individuals), and the output is a vector of length nInd representing a trait (e.g. height). There are also two hidden layers with 64 and 32 neurons, respectively. In summary, the number of neurons changes as:

nGene --> 64 --> 32 --> 1
In Keras, the MLP is:
from keras.models import Sequential
from keras.layers import Dense, Activation

# Instantiate
model = Sequential()

# Add first layer
model.add(Dense(64, input_dim=nGene))
model.add(Activation('relu'))

# Add second layer
model.add(Dense(32))
model.add(Activation('softplus'))

# Last, output layer
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100)
From the training output, the loss (MSE) at each epoch is less than one, and the prediction accuracy on the testing data is about 0.6, which is good.

In Flux.jl, I built the same MLP as follows:
using Flux

data = Iterators.repeated((X_train_t, Y_train), 100)

model = Chain(
    Dense(nGene, 64, relu),
    Dense(64, 32, softplus),
    Dense(32, 1))

loss(x, y) = Flux.mse(model(x), y)
ps = Flux.params(model)
opt = ADAM()
evalcb = () -> @show(loss(X_train_t, Y_train))

Flux.train!(loss, ps, data, opt, cb = evalcb)
X_train_t is an nGene by nInd matrix, and Y_train is a vector of length nInd.
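For context, here is a rough sketch of how these two arrays relate to the Keras ones (X_train and y_train are the same arrays as in the Keras code above; I'm only showing the transpose step, not my full preprocessing):

# Flux's Dense layers expect one sample per column, so I transpose the
# Keras-style matrix (nInd x nGene, one row per individual):
X_train_t = permutedims(X_train)   # nGene x nInd, one column per individual
Y_train   = vec(y_train)           # phenotype vector of length nInd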
The loss is very, very high, and the prediction accuracy on the testing data is almost zero. In Flux.jl, if I change the optimiser to plain gradient descent, it doesn't even converge.
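In case it matters, by "prediction accuracy" I mean the correlation between predicted and observed values on the test set, computed roughly like this (X_test_t and Y_test are just the test-set counterparts of the training arrays, laid out the same way):

using Statistics

y_hat = vec(model(X_test_t))   # model output is 1 x nInd, flatten to a vector
accuracy = cor(y_hat, Y_test)  # correlation between predicted and observed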
I really don't know why the training process in Flux.jl goes wrong. Could you please give me a hint on what's wrong with my code?
Thank you very much,