Introductory ML model in Flux

I am trying to replicate Figure 8 (page 18) and Algorithm 6.1 (page 17) from this introductory ML paper for mathematicians. For convenience, the figure is:

[image: Figure 8 from the paper]

where the ten points are classified as category A (red circles) or category B (blue crosses).

Here is my Flux model (complete MWE):

using Flux

# copy the data points from the paper
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y  = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];

# convert to the form Flux expects: a vector of inputs and a vector of targets
xs = [[x1[i], x2[i]] for i = 1:10]
ys = [y[:, i] for i = 1:10]

# Flux model with input in R^2 and output in R^2
m = Chain(Dense(2, 3, σ), Dense(3, 2, σ), Dense(2, 2, σ))

# apply the model to the x data points to see how well this randomly initialized model classifies the points
modresults = m.(xs) 

# print the results
10-element Vector{Vector{Float64}}:
 [0.6655142490442801, 0.5706965426383223]
 [0.666123956314254, 0.5707553740872817]
 [0.6660863149694488, 0.5707991135524418]
 [0.6670868379883004, 0.5708697378084514]
 [0.6659266662507914, 0.5706888975327645]
 [0.6662432487462897, 0.5706847722527106]
 [0.6665849850562766, 0.570792851971794]
 [0.6663244954298975, 0.5705298786503195]
 [0.6662136266692837, 0.5707463831017299]
 [0.6667586892722243, 0.5707714286413034]

# classify the results into category A if F1(x) > F2(x), B otherwise where F1 and F2 are the first and second components of the output
class = map(r -> r[1] >= r[2] ? [1.0, 0.0] : [0.0, 1.0], modresults)

10-element Vector{Vector{Float64}}:
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]
 [1.0, 0.0]

So obviously the randomly initialized model isn’t doing a very good job. Let’s train the model:

datapts = zip(xs, ys) # collect the xs and ys for training purposes. 
loss(x, y) = Flux.Losses.mse(m(x), y)
ps = params(m)
Flux.train!(loss, ps, datapts, Descent(0.01))

but this doesn’t really do much. Applying the trained model to the input data doesn’t change the results much.

modresults = m.(xs)  ## now, given the original data points, see how well it classifies them 
class = map(r -> r[1] >= r[2] ? [1.0, 0.0] : [0.0, 1.0], modresults)

# modresults
10-element Vector{Vector{Float64}}:
 [0.49853694974810886, 0.5012385889724634]
 [0.49882430021793656, 0.5011796138586837]
 [0.4990297863696163, 0.5012587103720679]
 [0.49931308843249256, 0.5011119262471514]
 [0.4985347432659953, 0.5011273015533314]
 [0.4985718587162263, 0.5010484679398993]
 [0.49898766820104984, 0.5011212037652792]
 [0.49829956801599506, 0.500842015490126]
 [0.4987827364676438, 0.5011429337398265]
 [0.4989026390464357, 0.5010455716217703]

# classification 
10-element Vector{Vector{Float64}}:
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]
 [0.0, 1.0]

The correct answer should be that half of the points are labelled [1, 0] and the other half [0, 1]. What am I doing wrong?

Flux.train! runs through 1 epoch of training, that is, it makes a single pass through the training data. The linked MATLAB example uses about 999,999 more iterations, so I think you’d have more success by training for a little longer :slight_smile:
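For example, here is a minimal sketch of wrapping `Flux.train!` in an epoch loop, reusing the data and model from the question. It keeps the same implicit-parameters API (`params` + `train!`) used above, which works on older Flux versions; the epoch count here is illustrative, not the paper's.

```julia
using Flux

# same data and model as in the question
x1 = [0.1,0.3,0.1,0.6,0.4,0.6,0.5,0.9,0.4,0.7];
x2 = [0.1,0.4,0.5,0.9,0.2,0.3,0.6,0.2,0.4,0.6];
y  = [ones(1,5) zeros(1,5); zeros(1,5) ones(1,5)];
xs = [[x1[i], x2[i]] for i = 1:10]
ys = [y[:, i] for i = 1:10]
m  = Chain(Dense(2, 3, σ), Dense(3, 2, σ), Dense(2, 2, σ))

datapts = zip(xs, ys)
loss(x, y) = Flux.Losses.mse(m(x), y)
ps = params(m)

loss_before = sum(loss(x, yi) for (x, yi) in datapts)

# Flux.train! makes one pass over datapts, so loop it for many epochs.
# 10_000 is an illustrative choice; the paper's MATLAB code runs ~10^6 steps.
for epoch in 1:10_000
    Flux.train!(loss, ps, datapts, Descent(0.01))
end

loss_after = sum(loss(x, yi) for (x, yi) in datapts)
```

After enough epochs the total loss should drop well below its initial value, and the `r[1] >= r[2]` classification rule from the question should start separating the two categories.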


You know, I was doing that, but with only 1000 iterations it wasn’t doing much. That led me to believe the train function has internal iterations, so I didn’t bother pasting the loop here.

You were right: increasing to 100,000 iterations did solve the issue.
