Hi everyone, I'd like to announce a package for building Transformer-based models with Flux.jl.
With the package you can build a Transformer model with the following syntax:
```julia
using Transformers
using Transformers.Basic

encoder = Stack(
    @nntopo(e → pe:(e, pe) → x → x → $N),
    PositionEmbedding(512),
    (e, pe) -> e .+ pe,
    Dropout(0.1),
    [Transformer(512, 8, 64, 2048) for i = 1:N]...
)

decoder = Stack(
    @nntopo((e, m, mask):e → pe:(e, pe) → t → (t:(t, m, mask) → t:(t, m, mask)) → $N:t → c),
    PositionEmbedding(512),
    (e, pe) -> e .+ pe,
    Dropout(0.1),
    [TransformerDecoder(512, 8, 64, 2048) for i = 1:N]...,
    Positionwise(Dense(512, length(labels)), logsoftmax)
)
```
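To give a feel for how the pieces fit together, here is a minimal usage sketch. It assumes `N` and `labels` have been defined as in the snippet above, and it feeds a randomly generated embedding matrix through the encoder; the specific input shapes and variable names here are my assumptions for illustration, not part of the package API:

```julia
# Hypothetical usage sketch (shapes and names are assumptions, not package API).
N = 6                          # number of Transformer layers, as in the snippet above

# Fake word embeddings: 512 features × 10 tokens.
e = randn(Float32, 512, 10)

# The encoder Stack adds position embeddings, applies dropout,
# then runs the N Transformer layers, returning 512 × 10 hidden states.
x = encoder(e)
```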
Both are fully compatible with the current Flux API. You can find more information in the README.
I'll also be working on the BERT model as part of a JSoC 2019 project. If you're interested, please take a look and give it a try.