Simple NLP in Flux with Embedding Layer

I’m trying to learn how to use embedding layers, using 10 string documents labelled as positive or negative.

StringDocs = ["well done", "good work", "great effort", "nice work", "excellent", "weak", "poor effort",  "not good", "poor work", "could have done better"]
y = [1,1,1,1,1,0,0,0,0,0]
pad_size=4
N = 10

Each word in the documents is represented by an integer from the vocabulary, and after some code to prepare the training data, x becomes:

(Note it is transposed to a pad_size x N matrix, one document per column, since Flux expects the batch along the second dimension.)

x = [  0   0  0   0  0   0   0   0   0  2 ;
       0   0  0   0  0   0   0   0   0  8 ;
      13   6  7   9  0   0  11  10  11  3 ;
       3  14  4  14  5  12   4   6  14  1 ]
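
For context, the prep code was roughly like this (a sketch, not my exact code; the integer assigned to each word depends on how the vocabulary is ordered, and 0 is reserved for padding):

vocab = unique(vcat(split.(StringDocs)...))
word_to_int = Dict(w => i for (i, w) in enumerate(vocab))

function encode(doc)
    ints = [word_to_int[w] for w in split(doc)]
    vcat(zeros(Int, pad_size - length(ints)), ints)   # left-pad with zeros
end

x = hcat(encode.(StringDocs)...)   # pad_size x N integer matrix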

The rest of the code is below. I was hoping the embedding layer (W) would change and learn after every epoch, but it isn’t changing. I’ve been stuck on this for a while; I’d be grateful for any pointers or examples that might help.

using Flux, Statistics   # Statistics provides mean

data = [(x, y)]

W = param(Flux.glorot_normal(8, 51))   # embedding matrix: 8 features x 51-entry vocab
max_features, vocab_size = size(W)

one_hot_matrix = Flux.onehotbatch(reshape(x, pad_size*N), 0:vocab_size-1)   # 51 x 40 one-hot encoding of all tokens

m = Chain(x -> W * one_hot_matrix,
          x -> reshape(x, max_features, pad_size, N),
          x -> mean(x, dims=2),
          x -> reshape(x, 8, 10),
          Dense(8, 1),
)
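
To sanity-check the intended shapes at each step (a quick sketch using the variables above):

h1 = W * one_hot_matrix                        # 8 x 40: one embedding column per token
h2 = reshape(h1, max_features, pad_size, N)    # 8 x 4 x 10: tokens grouped by document
h3 = mean(h2, dims=2)                          # 8 x 1 x 10: average embedding per document
h4 = reshape(h3, 8, 10)                        # 8 x 10: one column per document
# Dense(8, 1) then maps this to a 1 x 10 row of scores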

# if I add softmax above, the loss doesn't change (softmax of a single output is always 1, so the prediction becomes constant)

loss(x, y) = Flux.mse(m(x), y)
optimizer = Flux.Descent(0.001)

for epoch in 1:10
    Flux.train!(loss, Flux.params(m), data, optimizer)
    println("loss=",loss(x, y).data)
end
show(Flux.params(m))

The output is:

loss=3.7767549
loss=3.7263222
loss=3.6780336
loss=3.6317985
loss=3.5875306
loss=3.5451438
loss=3.5045598
loss=3.4657013
loss=3.4284942
loss=3.39287
Params([Float32[0.419828 -0.139223 -0.225595 0.0708142 0.232704 0.0907047 0.707192 -0.167613] (tracked), Float32[0.144213] (tracked)])

I was expecting to also see the 8x51 parameters for the embedding layer.
I could be close or I could be way off, but I can’t find examples close enough to help me progress.

params in Flux only collects parameters from structures that have been made @treelike; a plain closure over a tracked array is invisible to it, which is why it is not catching the embedding layer's parameters. To make it work you need to change your code to something like this:

struct EmbeddingLayer
    W
    EmbeddingLayer(mf, vs) = new(param(Flux.glorot_normal(mf, vs)))
end

@Flux.treelike EmbeddingLayer

(m::EmbeddingLayer)(x) = m.W * Flux.onehotbatch(reshape(x, pad_size*N), 0:vocab_size-1)

m = Chain(EmbeddingLayer(max_features, vocab_size),
          x -> reshape(x, max_features, pad_size, N),
          x -> mean(x, dims=2),
          x -> reshape(x, 8, 10),
          Dense(8, 1),
)

The rest of your snippet should stay the same.
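
With the layer made @treelike, the embedding matrix is picked up by params, so train! will update it. A quick check (a sketch, assuming the globals from your snippet):

ps = Flux.params(m)
for p in ps
    println(size(p))   # should print (8, 51), (1, 8) and (1,): embedding W, then the Dense weight and bias
end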

Hope this helps.

Thanks so much @tanhevg, I got it going with your help :grinning:!!

I had made a couple of other rookie errors, which I corrected in this blog post.

If you stumble on this page grappling with something similar, have a read.
