I’ve made an embedding layer that should be able to embed a batched input. The forward pass seems to work fine: I get back an array with the dimensions I expect.
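For reference, the shape behavior I’m relying on works as expected with plain (untracked) arrays; here `table` and `x` just stand in for my embedding table and batched input:

```julia
# Plain-array sketch of the forward pass (no Flux/Tracker involved)
table = randn(100, 16)             # voc_size × feature_size
x = rand(1:100, 10, 32)            # input-size × batch-size of token ids
out = table[x, :]                  # 10 × 32 × 16
out = permutedims(out, (1, 3, 2))  # 10 × 16 × 32: input × feature × batch
```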

The backward pass, however, doesn’t work. I get an error saying: `DimensionMismatch("tried to assign 10×32×16 array to 320×16 destination")`

```
using Flux

struct Embedding
    table
end

Embedding(voc_size, feature_size) = Embedding(param(Flux.glorot_normal(voc_size, feature_size)))

# Unbatched input: a vector of token ids
(e::Embedding)(x) = e.table[x, :]

# Batched input: input-size × batch-size matrix of token ids
function (e::Embedding)(x::AbstractArray{T, 2}) where {T}
    out = e.table[x, :]                 # input-size × batch-size × feature-size
    return permutedims(out, (1, 3, 2))  # input-size × feature-size × batch-size
end

@Flux.treelike Embedding

model = Embedding(100, 16)     # feature-size = 16
input = rand(1:100, (10, 32))  # input-size = 10, batch-size = 32
model(input)                   # input-size × feature-size × batch-size

loss(x) = sum(model(x))  # not actually meaningful, just as a test
loss(input)

Tracker.gradient(params(model)) do
    loss(input)
end # DimensionMismatch("tried to assign 10×32×16 array to 320×16 destination")
```

Am I using unsupported functionality? I suspect the problem lies in the use of `permutedims`, but how can I solve it?

I’ve also tried to do this using Zygote (swapping `Tracker.gradient` for `Zygote.gradient`), but then I get the same error.

Thanks for any help!

Jules