Sentence Transformers integration
#4, opened by tomaarsen (HF staff)
Hello!
Pull Request overview
- Add Sentence Transformers integration.
Details
This PR adds proper support for this model in Sentence Transformers, the package often used in third-party embedding applications. It abstracts away much of the transformers code from the user and moves it into the configuration. As a result, the user can just use:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-xs")
queries = ['what is snowflake?', 'Where can I get the best tacos?']
documents = ['The Data Cloud!', 'Mexico City of Course!']
query_embeddings = model.encode(queries, prompt_name="query")  # applies the "query" prompt from the model configuration
document_embeddings = model.encode(documents)  # documents are encoded without a prompt
instead of manually loading both the model and the tokenizer, adding the query prompt themselves, computing the token embeddings, taking the CLS embedding, and then normalizing it.
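For illustration, the manual post-processing steps listed above (adding the query prompt, CLS pooling, normalization) can be sketched roughly as follows. This is a minimal sketch, not the code from this PR: it uses plain Python lists as a stand-in for the tensors a transformers model would return, so it runs without downloading anything. The `QUERY_PROMPT` value and all function names are hypothetical; the actual prompt is stored in the model's configuration.

```python
import math

# Hypothetical placeholder: the real query prompt lives in the model's
# Sentence Transformers configuration and may differ from this string.
QUERY_PROMPT = "query: "


def add_query_prompt(queries, prompt=QUERY_PROMPT):
    """Prepend the query prompt to every query string."""
    return [prompt + q for q in queries]


def cls_pool(token_embeddings):
    """Keep the first ([CLS]) token vector of each sequence.

    token_embeddings: list of sequences, each a list of token vectors.
    """
    return [sequence[0] for sequence in token_embeddings]


def l2_normalize(vectors, eps=1e-12):
    """Scale each vector to unit L2 norm (eps guards against division by zero)."""
    normalized = []
    for vec in vectors:
        norm = max(math.sqrt(sum(x * x for x in vec)), eps)
        normalized.append([x / norm for x in vec])
    return normalized


# Toy stand-in for a model output: 2 sequences, 3 tokens each, 2 dimensions.
toy_token_embeddings = [
    [[3.0, 4.0], [1.0, 1.0], [0.0, 2.0]],
    [[0.0, 5.0], [2.0, 2.0], [1.0, 0.0]],
]

prompted = add_query_prompt(["what is snowflake?"])
sentence_embeddings = l2_normalize(cls_pool(toy_token_embeddings))
```

With the integration, these steps are driven by the configuration files instead, so `model.encode(...)` is the only call the user needs.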
P.S. Sentence Transformers is maintained by Hugging Face.
- Tom Aarsen
tomaarsen changed pull request status to open
spacemanidol changed pull request status to merged