It seems like Julia doesn’t have much going on in terms of LLM training or inference in general.
For example, searching “Julia language llama gpt” returns this thread from 2023: LLaMA in Julia? - #12 by ImreSamu. It mentions GitHub - cafaxo/Llama2.jl: Julia package for inference and training of Llama-style language models as a pure-Julia implementation of LLaMA. As of today, the last commit was a year ago (42 commits total) and the repo has 140 stars. IMO that is extremely low for an LLM project given the current hype. I don’t think the package is “bad”; I think very few people know about it or use it. Meanwhile, the PyTorch-based HuggingFace Transformers library is widely used for running LLMs.
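For what it’s worth, the package itself looks simple to use. A minimal inference sketch, assuming the `load_karpathy_model` / `sample` entry points its README shows and a locally downloaded llama2.c-style checkpoint (the file names below are placeholders):

```julia
using Llama2  # assumes: pkg> add Llama2

# Load a llama2.c-format checkpoint plus its tokenizer file
# (placeholder paths; checkpoints come from karpathy/llama2.c / TinyStories).
model = load_karpathy_model("stories42M.bin", "tokenizer.bin")

# Sample a continuation from a prompt.
sample(model, "Julia is"; temperature = 0.9f0)
```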
Another project, GitHub - rai-llc/LanguageModels.jl: Load nanoGPT-style transformers in Julia. Code ported from @karpathy's llama2.c, was last active 2 years ago and has 61 stars. Meanwhile, GitHub - ggml-org/llama.cpp: LLM inference in C/C++ has 82 thousand stars. Why is it written in C++ instead of Julia? That’s more of a rhetorical question…
A substantial LLM ecosystem (GitHub - chengchingwen/Transformers.jl: Julia Implementation of Transformer models, GitHub - chengchingwen/BytePairEncoding.jl: Julia implementation of Byte Pair Encoding for NLP, GitHub - chengchingwen/NeuralAttentionlib.jl: Reusable functionality for defining custom attention/transformer layers. etc.) is maintained by chengchingwen. The Transformers.jl repo has a respectable 500+ stars; the rest barely reach a hundred, even though their functionality is absolutely fundamental. BytePairEncoding.jl is basically Julia’s only alternative to tiktoken (GitHub - openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models., 15k stars), so why isn’t it extremely popular? I’m using it right now, and it seems to be the only way of accessing ChatGPT’s tokenizers locally from Julia. Why doesn’t it have the thousand stars it deserves? Maybe Julia programmers just aren’t that into LLMs? Why not? “Everyone else” seems to be extremely excited about them.
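For concreteness, this is roughly how I use it to tokenize text locally; a minimal sketch assuming BytePairEncoding.jl’s `load_tiktoken` helper for loading OpenAI’s published encodings (check the package README for the exact API):

```julia
using BytePairEncoding  # assumes: pkg> add BytePairEncoding

# Load the cl100k_base encoding used by the ChatGPT-era OpenAI models;
# the BPE rank data is fetched and cached on first use.
enc = BytePairEncoding.load_tiktoken("cl100k_base")

# Calling the tokenizer splits a string into BPE tokens (substrings),
# which is already enough for local token counting.
tokens = enc("Julia is fast, but is it popular for LLMs?")
@show tokens length(tokens)
```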
GitHub - brian-j-smith/Mamba.jl: Markov chain Monte Carlo (MCMC) for Bayesian analysis in julia is a package for Bayesian statistics (last commit 5 years ago), not for language models. I couldn’t find any Julia implementation of the Mamba language-model architecture.