Is Rust (and/or Python) the new high-level API? (here an ANN/GGUF/Llama example)

This state-of-the-art library has Rust and Python APIs, but ironically no high-level C++ one, despite being written in C++:

I’m guessing C-like C++ is kinda necessary for bindings to other [languages]

[TensorFlow, also written in C++, officially had only a stable Python API (while the Julia API was arguably better, until it went unmaintained).]

If we want to call this library, we certainly can, using PythonCall.jl; calling its Rust API is also an option. So which would you prefer?
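For the Python route, the glue is minimal. Here is a sketch of what calling it through PythonCall.jl could look like; note the module, loader, and generate names below are hypothetical, since they depend on the library's actual Python API:

```julia
using PythonCall

llm = pyimport("somellm")                  # hypothetical module name
model = llm.load("model-q4_k_m.gguf")      # hypothetical loader
reply = model.generate("Why use Julia?")   # hypothetical call
println(pyconvert(String, reply))          # convert the Py string to a Julia String
```

The nice part is that PythonCall.jl handles conversions in both directions, so the Julia wrapper stays thin.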

You might wonder why we wouldn't just use it, given that it supports:

  • GGUF models: Llama 2, Llama 3, and Phi-3 (not all quantization variants may work)
  • Andrej Karpathy’s llama2.c format

Note that this is only the file format; Karpathy’s excellent llama2.c isn’t actually used, nor would it suffice if it were.

llama.cpp supports all of GGUF and all the quantization types, and I’m not sure there’s any real alternative.
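For what it's worth, the GGUF container itself is simple at the top: the file opens with the 4-byte magic "GGUF", a little-endian UInt32 version, then tensor and metadata-KV counts as UInt64s. A minimal header-check sketch in plain Julia (the hard part llama.cpp solves is everything after this, i.e. the quantized tensor data):

```julia
# Read the fixed GGUF header: magic, version, tensor count, metadata-KV count.
function gguf_header(path::AbstractString)
    open(path) do io
        read(io, 4) == b"GGUF" || error("not a GGUF file")
        version   = ltoh(read(io, UInt32))   # format version (currently 3)
        n_tensors = ltoh(read(io, UInt64))
        n_kv      = ltoh(read(io, UInt64))
        (version = version, n_tensors = n_tensors, n_kv = n_kv)
    end
end
```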

So why is Llama[2].jl being written from scratch in pure Julia? It’s great that you can, but is it really needed, or even wanted? I think we should reuse great code.

I think Julia could well be the high-level API for end users (but also for other languages, i.e. replacing C++ as the implementation language).

Until then, should we rather be wrapping Rust, or Python (in general, not just for this)?

Wrapping C++ is possible but famously annoying, and whether you can hardly matters if programs (increasingly?) don’t provide a C++ API in the first place. Rust also has an issue (or is it just this one, and solvable?): the compiler is free to rearrange struct fields. That’s good for Rust (for performance), but bad for languages wrapping Rust, so you must forbid the reordering, which Rust supports precisely for this “C-like API” case. Though in most cases you don’t even need to: you just wrap the API, without exposing the structs.
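To make the struct point concrete, here is a sketch of the Julia side of calling into a Rust cdylib. Everything named here (library, function, struct) is hypothetical; the premise is that the Rust side declared the struct `#[repr(C)]` (pinning field order and layout) and exported the function as `#[no_mangle] pub extern "C"`, i.e. the “C-like API”:

```julia
# Julia mirror of a hypothetical Rust struct:
#   #[repr(C)] pub struct GenParams { temperature: f32, max_tokens: u32 }
struct GenParams
    temperature::Float32   # field order and types must match the Rust side
    max_tokens::UInt32
end

# Call a hypothetical `#[no_mangle] pub extern "C" fn default_params() -> GenParams`
params = @ccall "libsomellm".default_params()::GenParams
```

Without `#[repr(C)]`, Rust is allowed to reorder those fields, and the Julia mirror above would silently read garbage, which is the whole problem.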

Just to throw it out there: there is already marcom/Llama.jl, a Julia interface that wraps llama.cpp. But it’s many versions behind; we should update the jll!
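Checking how far behind the pinned binary is should be a one-liner, assuming the dependency is the llama_cpp_jll package:

```julia
using Pkg
Pkg.status("llama_cpp_jll")                   # currently resolved jll version
Pkg.status("llama_cpp_jll"; outdated = true)  # flags it if a newer build is registered
```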
