Is there an implementation of the attention mechanism in Flux.jl?

Azamat · July 29, 2019, 9:39pm

Specifically, I’m interested in the attention mechanism described in the Listen, Attend and Spell paper (where it’s referred to as AttentionContext), but more generally, an implementation of any other kind of attention would also be quite useful to look at.

fborda · July 29, 2019, 10:03pm

In terms of attention, the Transformer is probably worth checking out, though there are likely simpler attention models written in Flux around.

https://github.com/chengchingwen/Transformers.jl

merckxiaan · July 30, 2019, 9:55am

I’ve got a regular old seq2seq implementation here.
The problem is that it doesn’t work very well… So it might be risky to use it as a reference.
If you do happen to spot a mistake I’d be happy to hear about it!

MacKenzieHnC · September 22, 2020, 9:32pm

I’m digging through your code right now. Thank you for the work you put into this.

The only problem I’ve found so far is a typo in the ipynb notebook: “Esentially, the encoder outputs and the hidden state of the decoder are used to a context vector”

Some word is missing between “used to” and “a” and I don’t know what it is.

merckxiaan · September 23, 2020, 6:29am

Hey @MacKenzieHnC
Thanks for taking a look!
The sentence should have been this, I think:

Essentially, the encoder outputs and the hidden state of the decoder are used to create a context vector

It’s been some time since I’ve worked on this, and the Flux API has changed quite a lot since then (most notably, the AD engine changed to Zygote).
It definitely would be interesting to see this model implemented in the up-to-date version of Flux.
I might do this when I get the time… but I’m notoriously slow.
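
For anyone reading along, the context-vector step from the corrected sentence above can be sketched roughly as follows. This is only a minimal illustration in plain Julia, not the code from the notebook and not a Flux layer: it assumes simple dot-product scoring, and the names `softmax_vec`, `attention_context`, `enc_outputs`, and `dec_hidden` are made up for the example. As I understand it, the Listen, Attend and Spell variant also passes the decoder state and the encoder outputs through learned transformations before scoring, which is left out here.

```julia
# Minimal dot-product attention in plain Julia (no Flux dependency).
# All function and variable names are illustrative assumptions.

# Numerically stable softmax over a vector of scores.
function softmax_vec(scores::AbstractVector{<:Real})
    shifted = scores .- maximum(scores)   # subtract the max to avoid overflow in exp
    weights = exp.(shifted)
    return weights ./ sum(weights)
end

# enc_outputs: (hidden_dim, seq_len) matrix, one column per encoder time step.
# dec_hidden:  (hidden_dim,) vector, the current decoder hidden state.
# Returns the context vector and the attention weights.
function attention_context(enc_outputs::AbstractMatrix{<:Real},
                           dec_hidden::AbstractVector{<:Real})
    scores  = enc_outputs' * dec_hidden   # one score per encoder step (length seq_len)
    weights = softmax_vec(scores)         # non-negative, sums to 1
    context = enc_outputs * weights       # weighted sum of encoder outputs (length hidden_dim)
    return context, weights
end

# Toy example: hidden_dim = 2, seq_len = 3.
enc_outputs = [1.0 0.0 2.0;
               0.0 1.0 2.0]
dec_hidden  = [1.0, 1.0]

context, weights = attention_context(enc_outputs, dec_hidden)
```

In the toy example the third encoder step has the largest dot product with the decoder state, so it receives the largest weight and dominates the context vector. In a real seq2seq decoder the context would then be combined with the decoder input or hidden state at each step.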

Volker · September 23, 2020, 6:49am

Maybe these links are interesting for you:

Topic	Category / tags	Replies	Views	Activity
Flux seq2seq	Machine Learning; question	12	3803	January 5, 2019
Julia Implementation of Transformer Neural Network Model	Machine Learning; flux	3	1656	April 19, 2019
[ANN] Transformers.jl	Package Announcements; announcement	6	1983	February 18, 2020
Flux: Machine Learning with Julia	Machine Learning; package, announcement	8	7909	March 3, 2017
Flux.jl RNN performance	Machine Learning	11	3173	October 28, 2018
