Flux vs Keras time series prediction performance


I compare the prediction quality of two neural networks on a time series: one trained with Keras and one trained with Flux. Unfortunately I'm not able to achieve the same quality (see figure) and I don't know why.


I used the same data and tried to use the same settings (e.g. batch size, optimizer, layers, activation functions, number of neurons).

Can anyone see what I've forgotten? Or explain why the performance is so different?

using Flux, Plots
using IterTools: ncycle   # needed for repeating the DataLoader over epochs below

batch_size = 32
train_loader = Flux.Data.DataLoader(X', y_train', batchsize=batch_size)

num_epochs = 1000

opt = ADAM(0.001, (0.9, 0.999))
m = Chain(Dense(size(X, 2), 30), Dense(30, 30), Dense(30, 1))
loss(x, y) = Flux.mse(m(x), y)
ps = Flux.params(m)

@time Flux.train!(loss, ps, ncycle(train_loader, num_epochs), opt)

@time MyKerasWrapper.train_nn((X, y_train), (X, y_train), [30, 30, 1],
                              ["sigmoid", "sigmoid", "sigmoid"], num_epochs, [], 1000,
                              String(@__DIR__) * raw"\keras_test_model_4_comparision_with_flux",
                              1, "adam", 0.001, batch_size)


y_model_train_keras = Float64.(MyKerasWrapper.eval_nn(X, String(@__DIR__) * raw"\keras_test_model_4_comparision_with_flux"))

y_model_train_flux = Float64.(vec(m(X')))

time_vec = collect(0.1:0.1:length(y_train) * 0.1)

plot(time_vec, [y_train, y_model_train_flux, y_model_train_keras], xlabel = "t in s", ylabel = "y", label=["Measurement" "Flux-Model" "Keras-Model"])

Syncing hyperparameters between frameworks to get the same performance is generally painful, imo. Sometimes it is faster to just retune for the framework you want to use.

Check the weight initialization and whether the MSE is averaged or summed; these are typically not the same between frameworks.
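On the averaging point: Flux's `Flux.mse` takes the *mean* of the squared errors. A minimal sketch in plain Julia (no Flux required, with made-up toy vectors) showing how much the two conventions can differ:

```julia
using Statistics

# What Flux.mse computes by default: the mean of the squared errors.
mse_mean(ŷ, y) = mean(abs2, ŷ .- y)

# A summed variant, as some setups use — scales with the batch size,
# which effectively rescales the learning rate.
mse_sum(ŷ, y) = sum(abs2, ŷ .- y)

ŷ = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 5.0]

mse_mean(ŷ, y)  # 4/3
mse_sum(ŷ, y)   # 4.0
```

With a batch of 32, the summed loss would produce gradients 32× larger than the mean loss, so the same nominal learning rate behaves very differently.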


It looks like you are not using any non-linearity: Dense(30, 30) is equivalent to Dense(30, 30, identity); you want sigmoid instead.
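To illustrate why this matters, here is a minimal sketch in plain Julia (hypothetical toy weights, not Flux itself): a Dense layer computes `σ.(W*x .+ b)`, and with the default `σ = identity` a stack of Dense layers collapses into a single affine map, no matter how many layers you stack.

```julia
# Sigmoid, as Flux provides it under the name `sigmoid`.
σ(z) = 1 / (1 + exp(-z))

# Toy stand-in for a Dense layer: act.(W*x .+ b), identity by default.
dense(W, b, act=identity) = x -> act.(W * x .+ b)

# Hypothetical toy weights for a 2 → 2 → 1 chain.
W1, b1 = [1.0 2.0; 3.0 4.0], [0.0, 0.0]
W2, b2 = [1.0 1.0], [0.0]

linear_chain    = dense(W2, b2) ∘ dense(W1, b1)      # like Dense(2,2) → Dense(2,1)
nonlinear_chain = dense(W2, b2) ∘ dense(W1, b1, σ)   # sigmoid in the hidden layer

# Without activations, the two layers equal one collapsed affine map:
x = [1.0, 2.0]
linear_chain(x) == (W2 * W1) * x .+ (W2 * b1 .+ b2)  # true
```

So without activations, the Flux model above was just fitting a linear function to the time series, while the Keras wrapper was given `"sigmoid"` for every layer.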


Thanks, that was the problem. I thought sigmoid was the default activation function. Now it works :slight_smile:

