How to load models from HuggingFace with Transformers.jl

Hi All,

I quite like Transformers.jl. I do not aspire to train or fine-tune LLMs; I am more interested in experimenting with prompts. However, I am a bit lost when it comes to loading models from the Hugging Face Hub.
Here are a few questions:

  1. Why are models named with symbols, e.g. :gpt2?
  2. How do I find the valid names for a model and its task, e.g. the “lmheadmodel” part of “gpt2:lmheadmodel”? (See the snippet right after this list.)
  3. How do I find out whether a model is supported? Is there a tutorial on how to add an unsupported model?
  4. Has anyone tried to run a model on multiple GPUs when it does not fit on one?
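
To make question 2 concrete, this is the naming I mean (the hgf"" macro call is taken from the package's examples; the remark about symbols is just my reading of the docs):

using Transformers.HuggingFace   # brings the hgf"" string macro into scope

# string form: "<model name>:<task>", as used throughout the examples
model = hgf"gpt2:lmheadmodel"

# elsewhere models are referred to by a bare symbol such as :gpt2,
# and I cannot find where the mapping between the two forms is documented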

Let me give you a concrete example.
Let’s say that I want to load the model EleutherAI/gpt-neo-125m.

I can load the tokenizer as

hgf"EleutherAI/gpt-neo-125m:tokenizer"

However, I do not know which heads the model offers.
I figured out that I can probably load everything as

hgf"EleutherAI/gpt-neo-125m"

which seems to return a (tokenizer, model) tuple and “ditch” the task naming entirely.
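
In practice I then run the model like this (again a sketch; the hidden_state field name is my guess from inspecting the output, and may differ between versions):

using Transformers.HuggingFace, Transformers.TextEncoders

textenc, model = hgf"EleutherAI/gpt-neo-125m"

input = encode(textenc, "The Julia language is")
out = model(input)       # forward pass through the base model
out.hidden_state         # assumption: only hidden states here, no token logits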

Nevertheless, questions 1, 2, and 4 remain. The library is great; so much hard work, and so nicely done.

Loading the model with hgf"EleutherAI/gpt-neo-125m" is not complete, though, as the result is missing the cls head used to compute token logits. The problem is evidently in the loading step: hgf"gpt2" likewise loads the tokenizer and the model but omits the cls layer that computes logits, whereas loading GPT-2 as hgf"gpt2:lmheadmodel" gives a model that does contain the cls layer computing logits.
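
To make the difference concrete (the output field names below are my guesses from inspecting the returned structs; they may differ between versions):

using Transformers.HuggingFace, Transformers.TextEncoders

textenc, base = hgf"gpt2"            # base model: tokenizer plus model without the head
lm = hgf"gpt2:lmheadmodel"           # the same model with the cls/LM head

input = encode(textenc, "Hello")
base(input)   # something like (hidden_state = ...,)             -- no logits
lm(input)     # something like (logit = ..., hidden_state = ...) -- logits over the vocabulary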

But the same approach does not work for GPT-Neo:

hgf"EleutherAI/gpt-neo-125m:lmheadmodel"
ERROR: Model gpt_neo doesn't support this kind of task: lmheadmodel
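
Out of curiosity, I tried to find where this check happens. The following is pure guesswork: get_model_type is unexported, and both the name and the Val-based calling convention are my assumptions:

using Transformers

# assumption: this unexported helper is the registry the error message checks;
# I am not sure this is the intended way to call it
Transformers.HuggingFace.get_model_type(Val(:gpt2))
# I would expect something like (model = ..., lmheadmodel = ...) listing the
# registered tasks; presumably :gpt_neo has no :lmheadmodel entry, hence the error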