llama.cpp [1] has been getting a lot of attention on Hacker News [2] for its ability to run a Large Language Model (LLM) on any recent CPU with modest memory requirements. I’ve been meaning to get a better understanding of LLMs, so porting llama.cpp to Julia and running it on my laptop seems like a good way to do that.
Has anyone else already started a similar project, or does anyone have thoughts? I briefly went through the C++ code and, from what I can tell, it is fairly straightforward and a good fit for Julia.
[1] GitHub - ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++
[2] Using LLaMA with M1 Mac and Python 3.11 | Hacker News