TransformerBlocks.jl - Simple, blazing fast, transformer components

Hello everyone!

I’m very happy to publish my first package, TransformerBlocks.jl.

In attempting to create the most performant implementation of Andrej Karpathy’s excellent ShakespeareGPT example (as seen on YouTube), I decided to publish my work as a package. I hope some of you find it useful.

The example code using TransformerBlocks.jl to implement ShakespeareGPT can be found here:
https://juliamltools.github.io/shakespeare-gpt

I’d love to get some real, critical feedback on my code style and to learn how I can better fit in with the Julia ecosystem. I really enjoy Julia and intend to release many more projects and tools. Thanks for your time.

15 Likes

Great idea, thanks for the contribution!
Just copying what was said on Slack before it gets erased: to achieve top performance, it’s a good idea to profile your code and see which parts take longest. At first glance, it seems like your untyped struct fields might result in type-instability.
If you need help solving that or other issues, I’m happy to contribute :slight_smile:
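
To illustrate the type-instability point, here is a minimal sketch. The structs `UntypedScale` and `TypedScale` and the function `apply` are hypothetical names made up for this example; they are not taken from TransformerBlocks.jl.

```julia
# Hypothetical layer with an untyped field: the field type is implicitly `Any`,
# so the compiler cannot know what `s.w` holds when compiling code that uses it.
struct UntypedScale
    w
end

# Same layer with a parametric field: the concrete type of `w` becomes part of
# the struct's type, so code using it can be specialized.
struct TypedScale{T<:AbstractVector}
    w::T
end

# Element-wise scaling; works for both structs.
apply(s, x) = s.w .* x

w = Float32[1, 2, 3, 4]
x = Float32[2, 2, 2, 2]
u = UntypedScale(w)
t = TypedScale(w)

# Inspect the declared field types.
println(isconcretetype(fieldtype(UntypedScale, :w)))  # false (the field is `Any`)
println(isconcretetype(fieldtype(typeof(t), :w)))     # true

# Both versions compute the same result; only inferability differs.
println(apply(u, x) == apply(t, x))                   # true
```

In the REPL, `@code_warntype apply(u, x)` should flag the `Any`-typed field access, and the `Profile` standard library (`@profile`) can help find which parts take longest.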

1 Like

Thanks! Just updated. Look OK?

1 Like

Looks good to me! I haven’t checked the rest but this is already a major step :+1:

1 Like

Nice work, and very clean, readable code. I see the struct fields have already been typed.
More broadly, I wonder how your library compares to Transformers.jl and NNlib.jl. Are you planning to complement or extend them? Is your implementation faster?

1 Like

TransformerBlocks.jl is more than 10 times faster than Transformers.jl in this example:
https://juliamltools.github.io/shakespeare-gpt

The latest I heard from the author of Transformers.jl is that some type instabilities may be causing problems there.

I haven’t yet compared TransformerBlocks.jl with NNlib.jl.

Thanks for the feedback!

3 Likes