I’m trying to get familiar with neural networks and Flux by estimating a series of simple models. First, I can successfully estimate a linear model using Flux:
using Plots
using Flux
using Flux: @epochs
gridsize = 100;
dgp(x) = -12x+3;
X = collect(range(0,stop=10,length=gridsize));
Y = dgp.(X);
data = [([X[i]], Y[i]) for i in 1:length(X)];
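(As an aside, I believe the same data could also be packed as a single batch, since Flux.train! just iterates over (input, target) tuples and Dense layers treat each column as one observation — a sketch, using the same X and Y:)

```julia
# Equivalent single-batch version: one (input, target) tuple,
# with the 100 observations as columns of a 1×100 matrix.
batchdata = [(reshape(X, 1, :), reshape(Y, 1, :))]
```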
model = Chain(Dense(1,1))
loss(x, y) = Flux.mse(model(x), y)
opt = Descent(0.01)
ps = Flux.params(model)
@epochs 10 Flux.train!(loss, ps, data, opt)
# Plot.
plot(X,[Y model(X').data'],label=["DGP" "Model"])
With only a few iterations, the model does a pretty good job:
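The learned parameters can also be inspected directly (this assumes the Tracker-based Flux 0.x API implied by the `.data` calls above, where parameters are TrackedArrays):

```julia
# Inspect the fitted layer (Flux 0.x / Tracker): the weight and
# bias should end up near the DGP coefficients -12 and 3.
W = model[1].W.data
b = model[1].b.data
println("slope ≈ ", W[1], ", intercept ≈ ", b[1])
```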
I’m running into problems trying to approximate a quadratic function. The code is largely the same:
using Plots
using Flux
using Flux: @epochs
gridsize = 100;
dgp(x) = x^2;
X = collect(range(0,stop=10,length=gridsize));
Y = dgp.(X);
data = [([X[i]], Y[i]) for i in 1:length(X)];
Q = 10;
model = Chain(Dense(1,Q,σ),
              Dense(Q,1,identity));
loss(x, y) = Flux.mse(model(x), y)
opt = Descent(0.01)
para = Flux.params(model)
@epochs 10 Flux.train!(loss, para, data, opt)
# Plot.
plot(X,[Y model(X').data'],label=["DGP" "Model"])
Theoretically, by the universal approximation theorem, a network with a single hidden layer should be able to approximate f(x) = x^2 over my compact grid, and 10 hidden units (i.e., Q = 10 in my code — one hidden layer of width 10, not 10 hidden layers) should be sufficient for a fairly good approximation. Running this code, however, generates a very “flat” model:
I’ve tried different activation functions, changing the learning rate of the gradient descent, and a few other things, but I’m wondering if I’m doing something wrong within Flux. Thanks in advance for any help!
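For concreteness, one of the variants I tried (relu instead of σ, a smaller learning rate, and more epochs) looks like this; it did not noticeably change the flat fit:

```julia
# One attempted variant: relu activation, smaller learning rate,
# more epochs -- the fit stayed essentially flat for me.
model = Chain(Dense(1,Q,relu), Dense(Q,1,identity));
loss(x, y) = Flux.mse(model(x), y)
opt = Descent(0.001)
@epochs 100 Flux.train!(loss, Flux.params(model), data, opt)
```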