[ANN] Transformers.jl

Hey guys, I want to announce this package for building Transformer-related models with Flux.jl.
GitHub: https://github.com/chengchingwen/Transformers.jl

With the package you can build the Transformer model (the architecture figure from the original post is omitted here) with this syntax:

using Transformers
using Transformers.Basic
using Flux  # for Dropout, Dense, relu, and logsoftmax

N = 6           # number of stacked layers (6 in the original Transformer paper)
labels = 1:1000 # placeholder output vocabulary; replace with your own token set

encoder = Stack(
    @nntopo(e → pe:(e, pe) → x → x → $N),
    PositionEmbedding(512),
    (e, pe) -> e .+ pe,
    Dropout(0.1),
    [Transformer(512, 8, 64, 2048) for i = 1:N]...
)

decoder = Stack(
    @nntopo((e, m, mask):e → pe:(e, pe) → t → (t:(t, m, mask) → t:(t, m, mask)) → $N:t → c),
    PositionEmbedding(512),
    (e, pe) -> e .+ pe,
    Dropout(0.1),
    [TransformerDecoder(512, 8, 64, 2048) for i = 1:N]...,
    Positionwise(Dense(512, length(labels)), logsoftmax)
)
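The `@nntopo` macro describes how data flows between the layers of a `Stack`: roughly, each `→` feeds the output of one layer into the next, `name:(a, b)` names and regroups intermediate variables, and `$N` repeats the preceding segment N times across the splatted layers. A minimal sketch of the idea (the layer sizes here are illustrative, not from the original post):

```julia
using Flux
using Transformers.Basic

# x → a → b means roughly: a = layer1(x); b = layer2(a)
mlp = Stack(
    @nntopo(x → a → b),
    Dense(10, 20, relu),
    Dense(20, 5)
)

y = mlp(randn(Float32, 10, 4))  # forward pass over a batch of 4 column vectors
```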

These are all compatible with the current Flux API. You can find more information in the README.
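For example, since a `Stack` behaves like any other Flux layer, it can be called on embedded input and trained with Flux's usual machinery. A hedged sketch (the shapes below are assumptions, not from the post; the decoder additionally expects the target embedding, the encoder memory, and an attention mask, per its topology):

```julia
using Flux

# `src` stands for already-embedded source tokens,
# shaped (hidden size, sequence length) — an assumed convention.
src = randn(Float32, 512, 10)
mem = encoder(src)  # encoder output ("memory") to feed the decoder

# A Stack trains like a plain Flux layer:
ps = Flux.params(encoder)
gs = Flux.gradient(() -> sum(encoder(src)), ps)
```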

I’ll also be working on the BERT model as one of the JSoC 2019 projects. If you are interested, please take a look and give it a try.
