Apologies in advance: I had a lot of trouble producing an MWE, so I don't have one.
I’m working on an RNN using text data (each observation is a sentence, and each word in each sentence is mapped to a word embedding). The problem is classification into six classes. I think the issue relates to how I am using the Flux.DataLoader type.
The basic setup is…
train_data = Flux.DataLoader((data=train, label=tlabels), batchsize=42, shuffle=true)
# train_data.data.data is 1x357, where each item in the vector is (vocab_size)x(sentence_length) and is a matrix of one-hot vectors
# train_data.data.label is 1x357, where each item in the vector is (6,) and is one-hot encoded
function two_layers(args)
    # Embed is a layer I defined to take one-hot vectors to word embeddings of length N=50
    scanner = Chain(Embed(args.vocab, args.embed_len, args.emb_table), LSTM(args.embed_len, args.N), LSTM(args.N, args.N))
    encoder = Dense(args.N, args.classes, identity)
    return scanner, encoder
end
function model(x, scanner, encoder)
    state = scanner(x)[:,end]
    reset!(scanner)
    encoder(state)
end
scanner, encoder = two_layers(args)
ps = params(scanner, encoder)
loss(x,y) = Flux.logitcrossentropy(model(x, scanner, encoder), y)
When I do:
for (x,y) in train_data
    gs = Flux.gradient(()->loss(x,y), ps)
    Flux.update!(opt, ps, gs)
end
I get the error:
ERROR: LoadError: MethodError: no method matching (::Flux.LSTMCell{Matrix{Float32}, Vector{Float32}, Tuple{Matrix{Float32}, Matrix{Float32}}})(::Tuple{Matrix{Float32}, Matrix{Float32}}, ::Matrix{Real})
I find this error confusing because I don’t know what the ::Matrix{Real} at the end of the error message refers to.
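One way to track down where an abstractly typed array such as Matrix{Real} first appears might be to apply the layers one at a time and log the type of each intermediate output. The sketch below is only an illustration in plain Julia: trace_types is a hypothetical helper, and the toy layers are stand-ins, not Flux layers. For a Flux Chain, the individual layers live in its layers field, so something like trace_types(scanner.layers, x[1]) could be tried; a custom layer such as Embed would be one place to look, though that is only a guess.

```julia
# Hypothetical helper: apply each layer in turn and report the output type.
function trace_types(layers, x)
    for (i, layer) in enumerate(layers)
        x = layer(x)
        @info "after layer $i" typeof(x) size(x)
    end
    return x
end

# Toy stand-ins for layers (assumptions, not Flux layers).
toy_layers = (
    x -> 2 .* x,                            # keeps the concrete Float32 element type
    x -> map(v -> v > 0.5f0 ? 1 : v, x),    # returns Int or Float32 per element, so eltype widens to Real
)

out = trace_types(toy_layers, Float32[0.1 0.9; 0.7 0.2])
```

In this toy case the second layer is where the element type stops being concrete, which is the kind of step the log output would point at.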
I expected (x,y) to iterate over batches of data and labels, and printing some type information about x and y inside the loop suggests that it does.
for (x,y) in train_data
    @info "type of (x,y)" typeof((x,y))
    @info "type of x", typeof(x)
    @info "size x", size(x)
    @info "size x[1]", size(x[1])
    @info "typeof y", typeof(y)
    @info "size y", size(y)
    @info "size y[1]", size(y[1])
end
┌ Info: type of (x,y)
└ typeof((x, y)) = Tuple{Matrix{Flux.OneHotArray{UInt32, 733, 1, 2, Vector{UInt32}}}, Matrix{Flux.OneHotArray{UInt32, 6, 0, 1, UInt32}}}
[ Info: ("type of x", Matrix{Flux.OneHotArray{UInt32, 733, 1, 2, Vector{UInt32}}})
[ Info: ("size x", (1, 42)) # 42 is the batch size
[ Info: ("size x[1]", (733, 18)) # 18 is the length of the sentence
[ Info: ("typeof y", Matrix{Flux.OneHotArray{UInt32, 6, 0, 1, UInt32}})
[ Info: ("size y", (1, 42))
[ Info: ("size y[1]", (6,))
I can also call loss(x[1], y[1]) on a single observation and get a number. So I think my problem might be solved by…
- Figuring out what that extra ::Matrix{Real} is and where it comes from?
- Broadcasting over the DataLoader type? (I have tried gradient(()->loss.(x,y) but that also gives an error.)
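To make the second idea concrete: since each batch is a 1×42 matrix whose elements are themselves matrices, a loss that works on one sentence has to be applied per element and then reduced to a single number. Below is a minimal sketch in plain Julia under stated assumptions: xs, ys, toy_loss and batch_loss are hypothetical stand-ins with made-up sizes, and no Flux model is involved.

```julia
# Toy stand-ins (assumptions): 4 "sentences" of differing length, 6 classes.
xs = reshape([rand(Float32, 6, n) for n in (3, 4, 6, 2)], 1, :)   # 1×4 Matrix of matrices
ys = reshape([Float32.(1:6 .== k) for k in (1, 3, 6, 2)], 1, :)   # 1×4 Matrix of one-hot vectors

# Hypothetical per-sample loss: accepts ONE numeric matrix and ONE label vector.
toy_loss(x::AbstractMatrix{<:Real}, y::AbstractVector{<:Real}) = sum(abs2, x[:, end] .- y)

# Apply the per-sample loss to each (sentence, label) pair and average to a scalar.
batch_loss(xs, ys) = sum(toy_loss(x, y) for (x, y) in zip(xs, ys)) / length(xs)

# The whole batch is not a numeric matrix, so the per-sample method does not apply to it.
@info "per-sample loss accepts the whole batch?" applicable(toy_loss, xs, ys)
@info "mean loss over the batch" batch_loss(xs, ys)
```

In the training loop, the same pattern might be written as gradient(() -> sum(loss(xi, yi) for (xi, yi) in zip(x, y)), ps), although that is untested here. Note that loss.(x,y) on its own returns an array of losses rather than a single number, so it would need a sum or mean around it before it could be differentiated.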
I’m definitely missing something but I can’t figure out what! I appreciate any help and apologize for not having an MWE…