OK… I’ve had time to check out Flux. I think I’ve made it work by “cheating” and looking at other threads, but 3 questions remain in my “example” for dummies…
a. My data are generated from $y_\mathrm{d}=\sin(x_\mathrm{d})$ with $x_\mathrm{d}\in [-\frac{\pi}{2},\frac{\pi}{2}]$. I generate $N=100$ data points using range. [Yes, I know this is stupidly simple, but it clarifies the understanding…]
b. My understanding is that Flux needs data in the following form (at least for the Dense layers…): data = [(xd,yd)], where $\mathrm{xd} \in \mathbb{T}^{n_x \times N}$ and $\mathrm{yd} \in \mathbb{T}^{n_y \times N}$. Here, I have used $\mathbb{T}$ to denote the element type; I use Float64.
I generate the data as follows:
# Packages
using Flux
using Plots; pyplot()
using Statistics

# Data: 1×100 row matrices, i.e., n_x × N and n_y × N with n_x = n_y = 1
x_d = reshape(collect(range(-pi/2, pi/2, length=100)), 1, 100)
y_d = sin.(x_d)
plot(x_d', y_d')     # just to check
data = [(x_d, y_d)]  # collection of (input, output) tuples for Flux.train!
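As a sanity check that the shapes match the form in b., the sizes and element type can be inspected:

size(x_d)    # (1, 100), i.e. n_x × N
size(y_d)    # (1, 100), i.e. n_y × N
eltype(x_d)  # Float64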
c. I want to start with a single layer, thus with $n_x = 1$ inputs and $n_y = 1$ outputs, and the $\tanh$ nonlinear output mapping. In other words, $y = \tanh(wx+b)$, which has 2 parameters $(w,b)$. This should allow for an OK first example “for dummies”.
d. My impression is that with the basic layer Dense(nx,ny,sigma), where sigma is the function for the output nonlinearity, I can set up the problem as follows: (i) a model mod, which is $\tanh(wx+b)$; (ii) a parameter set par, which is $(w,b)$ (par also keeps track of the model); (iii) a fitting function loss, which is least squares; (iv) a parameter optimization algorithm opt; and (v) one update of the parameters:
# Set up Flux problem
mod = Dense(1, 1, tanh)               # model: y = tanh(w*x + b)
par = params(mod)                     # trainable parameters (w, b)
loss(x, y) = mean((mod(x) .- y).^2)   # least-squares loss
opt = ADAM(0.002, (0.99, 0.999))      # optimizer

# One update of the parameters
Flux.train!(loss, par, data, opt)
e. I can set up a sequence of (say, 1000) updates with the command:
@Flux.epochs 1000 Flux.train!(loss,par,data,opt)
f. At any time, I can read the parameters par and check the fit (loss) with the commands:
par             # current parameter values
loss(x_d, y_d)  # current loss
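Side note: if I read the Dense layer right, the individual weight and bias can presumably also be read directly off the layer fields (the field names W and b are my assumption, from peeking at the source):

mod.W   # weight w, a tracked 1×1 array (assuming the field is named W)
mod.b   # bias b (assuming the field is named b)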
3 remaining problems

1. Flux seems to generate Float32 data. Can I change this to using Float64 somehow?
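My only idea so far, pieced together from other threads, is to construct the layer from explicit Float64 arrays via param; I am not at all sure this is the intended way, and the Dense(W, b, sigma) constructor taking pre-made arrays is my assumption:

# Guess: build Dense from explicit Float64 parameter arrays
W0 = param(randn(1, 1))      # Float64 weight (randn gives Float64)
b0 = param(zeros(1))         # Float64 bias
mod64 = Dense(W0, b0, tanh)  # assumes Dense accepts pre-made (tracked) arrays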
2. Flux responds with tracked arrays, e.g.:

julia> typeof(mod(x_d))
TrackedArray{…,Array{Float32,2}}

… so: how can I convert this to untracked arrays, so that I can plot the model mapping via plot(x_d', mod(x_d)')? (This doesn’t work as written, because mod(x_d) is a TrackedArray…)
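From other threads, my guess is that Flux.data strips the tracking, so the plot would become something like the following (the transposes are because the arrays are 1×N row matrices; Flux.data being the right accessor is my assumption):

y_hat = Flux.data(mod(x_d))  # guess: plain untracked Array
plot(x_d', y_hat')           # model mapping vs. input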
3. Coming from outside of the Machine Learning community, the term Epoch sounds weird. Is an Epoch simply a major iteration in the parameter update scheme?
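For what it’s worth, my working assumption is that one epoch is one pass over data, so the macro in e. would just be sugar for a plain loop:

# Assumption: @Flux.epochs 1000 ... is equivalent to
for i in 1:1000
    Flux.train!(loss, par, data, opt)
end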
OK… answers to questions 1, 2, 3 would clarify basic use of Flux, and should make it possible for me to move on to more interesting problems. […including data with noise, splitting data between training and validation sets, chaining layers, multivariable problems, etc., etc.]