In attempting to create the most performant implementation of Andrej Karpathy’s excellent ShakespeareGPT example (as seen on YouTube), I decided to publish my work as a package. I hope some of you find it useful.
I’d love to get some real, critical feedback on my code style and to know how I can improve to better fit in with the Julia ecosystem. I really enjoy Julia and intend to release many more projects and tools. Thanks for your time.
Great idea, thanks for the contribution!
Just copying what was said on Slack before it gets erased: to achieve top performance, it’s a good idea to profile your code and see which parts take the longest. At first glance, it seems like your untyped struct fields might result in type instability.
If you need help solving that or other issues, I’m happy to contribute.
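To illustrate the type-instability point above, here is a minimal sketch. The `UntypedLinear` and `TypedLinear` structs and the `forward` function are hypothetical names made up for this example, not taken from the package. The idea is that a field without a type annotation is stored as `Any`, so the compiler cannot specialize code that reads it, whereas a parametric struct keeps the fields flexible but concrete.

```julia
# Hypothetical layer types for illustration only; not the package's actual code.

# Fields without annotations are stored as `Any`.
struct UntypedLinear
    W
    b
end

# Parametric fields: each instance records the concrete array types it holds.
struct TypedLinear{M<:AbstractMatrix,V<:AbstractVector}
    W::M
    b::V
end

# The same forward pass works for both structs.
forward(layer, x) = layer.W * x .+ layer.b

W = rand(Float32, 4, 3)
b = rand(Float32, 4)
x = rand(Float32, 3)

untyped = UntypedLinear(W, b)
typed = TypedLinear(W, b)

# The results are identical; only what the compiler can infer differs.
same_result = forward(untyped, x) == forward(typed, x)

# `fieldtype` shows what the compiler knows about each field.
println(isconcretetype(fieldtype(UntypedLinear, :W)))   # false (the field is `Any`)
println(isconcretetype(fieldtype(typeof(typed), :W)))   # true
```

In the REPL, `@code_warntype forward(untyped, x)` should highlight the `Any`-typed values, and the `Profile` standard library (`@profile` followed by `Profile.print()`) is one way to check whether such a spot actually matters for runtime.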
Nice work, and very clean, readable code. The struct fields have also been typed already.
From a bigger perspective, I wonder how your library compares to Transformers.jl and NNlib.jl? Are you planning to complement or extend them? Is your implementation faster?