Flux unable to differentiate an embedding layer

merckxiaan · July 6, 2019, 11:41am

I’ve made an embedding layer that should be able to embed a batched input. The forward pass seems to work: I get back an array with the dimensions I expect.

The backward pass, however, doesn’t work. I get this error: DimensionMismatch("tried to assign 10×32×16 array to 320×16 destination")

using Flux

struct Embedding
    table
end

Embedding(voc_size, feature_size) = Embedding(param(Flux.glorot_normal(voc_size, feature_size)))

(e::Embedding)(x) = e.table[x, :]

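# Batched input: x is input-size × batch-size
function (e::Embedding)(x::AbstractArray{T, 2}) where {T}
    out = e.table[x, :]  # input-size × batch-size × feature-size
    return permutedims(out, (1, 3, 2))  # input-size × feature-size × batch-size
end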

@Flux.treelike Embedding

model = Embedding(100, 16) # feature-size=16
input = rand(1:100, (10,32)) # input-size=10, batch-size=32

model(input) # input-size x feature-size x batch-size

loss(x) = sum(model(x)) # not actually meaningful, just as a test
loss(input)

Tracker.gradient(params(model)) do
    loss(input)
end  # DimensionMismatch("tried to assign 10×32×16 array to 320×16 destination")

Am I using unsupported functionality? I suspect the problem lies in the use of permutedims, but how can I solve this?
I’ve also tried this with Zygote (swapping Tracker.gradient for Zygote.gradient), but I get the same error.

Thanks for any help!
Jules

simeonschaub · July 6, 2019, 12:14pm

Could you provide a self-contained example? I’m not able to run it, since glorot_normal, @treelike and test are not defined.

merckxiaan · July 6, 2019, 12:22pm

Whoops, sorry, now it should run I believe.

merckxiaan · July 11, 2019, 10:11am

@mcabbott created a PR that should fix this: Non-scalar getindex (FluxML/Zygote.jl, Pull Request #256)
Thanks!
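
For anyone who wants to check the result once the fix is in: the gradient of the test loss above can be worked out by hand, since sum(table[x, :]) just adds up every looked-up row (permutedims doesn't change the sum). The sketch below is plain Julia and doesn't use Flux, Tracker, or Zygote. lookup_grad is a made-up helper name, and the random table is only a stand-in for the real embedding table. Each row of the gradient should equal the number of times that row index appears in the input.

```julia
# Hand-derived gradient of sum(table[x, :]) with respect to table.
# `lookup_grad` is a hypothetical helper, not part of Flux, Tracker, or Zygote.
function lookup_grad(table::AbstractMatrix, x::AbstractArray{<:Integer})
    grad = zeros(eltype(table), size(table))
    for i in x
        grad[i, :] .+= 1  # every lookup of row i adds 1 to each feature of that row
    end
    return grad
end

table = randn(100, 16)         # stand-in for the embedding table (assumption)
input = rand(1:100, (10, 32))  # same shape as the input in the first post
g = lookup_grad(table, input)

size(g)  # (100, 16)
sum(g)   # 10 * 32 * 16 = 5120.0
```

Note that repeated indices have to accumulate rather than overwrite, which is why the loop uses .+= instead of a plain assignment.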
