Sentence Embeddings using Transformers.jl

I’m trying to compute sentence embeddings with a Hugging Face model, similar to the Python example here: sentence-transformers/all-MiniLM-L6-v2 · Hugging Face.

So far I have this:

using Transformers.HuggingFace
using Transformers.TextEncoders

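# Load the text encoder and the model from the Hugging Face Hub
sentTrans = hgf"sentence-transformers/all-MiniLM-L6-v2"

enc = sentTrans[1]    # text encoder (tokenizer)
model = sentTrans[2]  # transformer model
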
sentences = [
    "This framework generates embeddings for each input sentence",
    "Sentences are passed as a list of string.",
    "The quick brown fox jumps over the lazy dog."
]

# Tokenize the sentences and run them through the model
out = model(encode(enc, sentences))

out[3] is a 384-element vector for each sentence, which is the size I expected, but the values don’t match what I get from the Python implementation.

I strongly suspect I’m just missing a step. I’d appreciate any guidance anyone can offer.
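One candidate for the missing step: the Python example on the model card does not use the raw model output directly. It averages the token embeddings over each sentence, weighted by the attention mask (mean pooling), and then L2-normalizes the result. Below is a minimal sketch of that post-processing on plain arrays. The helper name `mean_pool_normalize` is hypothetical (it is not part of Transformers.jl), and the array layout (hidden size × sequence length × batch for the token embeddings, sequence length × batch for the mask) is an assumption that would need to be checked against what the model actually returns.

```julia
# Hypothetical helper, not part of Transformers.jl:
# masked mean pooling followed by L2 normalization.
#
# Assumed layout (an assumption, verify against the real model output):
#   hidden: hidden_size × seq_len × batch   token embeddings
#   mask:   seq_len × batch                 1 for real tokens, 0 for padding
function mean_pool_normalize(hidden::AbstractArray{<:Real,3},
                             mask::AbstractMatrix{<:Real})
    # Reshape the mask so it broadcasts across the hidden dimension
    m = reshape(mask, 1, size(mask, 1), size(mask, 2))
    # Sum the embeddings of the non-padding tokens: hidden_size × batch
    summed = dropdims(sum(hidden .* m; dims=2); dims=2)
    # Number of real tokens per sentence (1 × batch), guarded against zero
    counts = max.(sum(mask; dims=1), 1e-9)
    pooled = summed ./ counts
    # Scale each column (one sentence embedding) to unit length
    return pooled ./ sqrt.(sum(abs2, pooled; dims=1))
end

# Toy input: one sentence, two real tokens and one padding token
hidden = reshape(Float32[1, 2, 3, 4, 100, 100], 2, 3, 1)
mask = reshape([1, 1, 0], 3, 1)
emb = mean_pool_normalize(hidden, mask)
```

In the toy input the padding token's values are ignored, so the pooled vector is the mean of the first two tokens before normalization. To apply this to the real model, the token-level hidden states and the attention mask would have to be taken from the model output and the encoder output respectively; where exactly they live is not confirmed here.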

Thanks.
