My journey training an LLM from scratch in Julia (and why I see huge potential)

I started training a language model from scratch in Julia, without pre-built libraries for the core: I wrote my own BPE tokenizer and my own training loop, ran into hallucinations, and rebuilt.
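Since the post keeps the actual code private, here is only a generic sketch of the idea behind a BPE tokenizer, not the author's implementation: count adjacent symbol pairs in the corpus, then merge the most frequent pair into a new token, repeating until the vocabulary is built. All names below are illustrative.

```julia
# Hypothetical sketch of one BPE merge step (not the author's code).

# Count every adjacent pair of symbols and return the most frequent one.
function most_frequent_pair(tokens::Vector{String})
    counts = Dict{Tuple{String,String},Int}()
    for i in 1:length(tokens)-1
        pair = (tokens[i], tokens[i+1])
        counts[pair] = get(counts, pair, 0) + 1
    end
    return argmax(counts)  # key (pair) with the highest count
end

# Replace every occurrence of `pair` with a single merged symbol.
function merge_pair(tokens::Vector{String}, pair::Tuple{String,String})
    merged = String[]
    i = 1
    while i <= length(tokens)
        if i < length(tokens) && (tokens[i], tokens[i+1]) == pair
            push!(merged, tokens[i] * tokens[i+1])
            i += 2
        else
            push!(merged, tokens[i])
            i += 1
        end
    end
    return merged
end

tokens = ["l", "o", "w", "l", "o", "w", "l", "o"]
p = most_frequent_pair(tokens)   # ("l", "o") occurs three times
merge_pair(tokens, p)            # ["lo", "w", "lo", "w", "lo"]
```

A real tokenizer repeats this merge step thousands of times and records the merge order, so the same merges can be replayed on new text at inference time.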

I tried Flux and Lux. Both have strengths, but also critical weaknesses (CUDA conflicts, design limitations). After a long struggle, I found a different path that worked.

PythonCall played a key role, bridging Julia to Python’s ecosystem when needed. But Julia itself was the heart of the project.
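For readers unfamiliar with it, this is roughly what that bridge looks like; a minimal sketch using PythonCall.jl's public API (`pyimport`, `pyconvert`), assuming a Python installation with numpy is available:

```julia
using PythonCall

np = pyimport("numpy")                    # load a Python module from Julia
x = np.arange(6)                          # a live Python object (numpy array)
v = pyconvert(Vector{Int}, x)             # convert it to a native Julia vector
sum(v)                                    # from here on it is ordinary Julia: 0+1+...+5
```

The appeal is that the Python side is only borrowed where needed, while the hot loops stay in Julia.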

What I learned is that Julia is not just “another language”. It is a platform for real understanding. If the community focuses on its unique strengths (speed, metaprogramming, Python interop), I believe Julia can surpass many expectations.

I am not sharing technical details yet. But I can already say this: Julia is inspiring. With more work, it can become much, much more.
I am sharing a screenshot as a proof of concept. The full code is not open-source at this stage. I want to document it properly first. I may share it later. I hope you understand and respect that.
#machinelearning