I currently do a lot of ML in Python using TensorFlow, which works fine, but Julia seems to be more than fine, so I'm experimenting with Flux as an alternative. My first step is just to train a simple feed-forward NN on a relatively small dataset (2000 samples) with a plain MSE loss. In TensorFlow this works fine and my MSE goes down to 10^-5 (synthetic data without noise, so no overfitting), but somehow in Flux I can't get it below about 0.03. My code is below: does anyone have any idea why it gets stuck there?
```julia
using Flux, Statistics, Random
using Flux: throttle

# Build a 2×N input matrix of (x, t) pairs and the matching targets
X = [[x, t] for x in data["x"] for t in data["t"]]
X = hcat(X...)
y = reshape(real(data["usol"]), (1, length(data["usol"])))

# Shuffle and keep 2000 training samples
idx = randperm(length(y))
X_train = X[:, idx][:, 1:2000]
y_train = y[idx][1:2000]
dataset = [(X_train, y_train)]

# Six tanh hidden layers of width 20, linear output
model = Chain(
    Dense(2, 20, tanh),
    Dense(20, 20, tanh),
    Dense(20, 20, tanh),
    Dense(20, 20, tanh),
    Dense(20, 20, tanh),
    Dense(20, 20, tanh),
    Dense(20, 1),
)
ps = Flux.params(model)

loss(x, y) = mean((model(x) .- y) .^ 2)
opt = ADAM(0.002, (0.99, 0.999))

evalcb() = @show(loss(X_train, y_train))
Flux.@epochs 5000 Flux.train!(loss, ps, dataset, opt, cb = throttle(evalcb, 5))
```
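One detail for anyone comparing the two frameworks side by side: Flux's `Dense` expects a features × samples matrix (which is what `hcat(X...)` produces above), whereas TensorFlow/Keras expects samples × features. A quick NumPy illustration of the two layouts (array names are just illustrative):

```python
import numpy as np

# 2000 samples of (x, t) pairs in Flux's features-first layout,
# matching the 2×2000 matrix built by hcat(X...) in the Julia code
X_flux = np.random.rand(2, 2000)

# The same data in the samples-first layout TensorFlow/Keras expects
X_keras = X_flux.T

print(X_flux.shape)   # (2, 2000)
print(X_keras.shape)  # (2000, 2)
```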