Are INSTRUCTOR embeddings compatible with LLAMA2?
#16 by iAFuisMe1234 - opened
Which LLMs are compatible with INSTRUCTOR embeddings? Are there any Git links with sample code?
Hello,
Better late than never!
If you want to use this in a RAG workflow, you can. Use this model to compute the embedding of every document, calculate the distance between the question's embedding and each document's embedding, retrieve the documents with the smallest distances, and pass them as context in your prompt to the LLM.
Example using faiss
# pip install sentence_transformers InstructorEmbedding faiss-cpu
from InstructorEmbedding import INSTRUCTOR
model = INSTRUCTOR('hkunlp/instructor-large')
# Get your documents
sentences = [
"(A) Call Mom \@Phone +Family",
"(A) Schedule annual checkup +Health",
"(B) Outline chapter 5 +Novel \@Computer",
"(C) Add cover sheets \@Office +TPSReports",
"Plan backyard herb garden \@Home",
"Pick up milk \@GroceryStore",
"Research self-publishing services +Novel \@Computer",
"x Download Todo.txt mobile app \@Phone"
]
instruction = "Represent the todo.txt item for retrieving it"
# create the embeddings for the documents
embeddings = model.encode([[instruction, sentence] for sentence in sentences])
# create the embeddings for the prompt
question = "I'm at the store, what do I have to buy?"
instruction_question = "Represent the question for retrieving a todo.txt item"
embeddings_question = model.encode([[instruction_question, question]])
# using faiss to store and compute the distances (should work with any vector database)
import faiss # make faiss available
index = faiss.IndexFlatL2(len(embeddings[0])) # build the index
index.add(embeddings) # add vectors to the index
k = 1 # we only want the single nearest neighbor
_, I = index.search(embeddings_question, k) # actual search
context = sentences[I[0][0]] # I has shape (1, k); take the top hit
# build your prompt
prompt = f"using this context \"{context}\" answer the user question: {question}"
# creating a dummy llm for example
def dummy_llm(prompt):
    """just return the prompt"""
    return prompt
print(dummy_llm(prompt))
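To answer the original question directly: the embeddings never touch the LLM, so INSTRUCTOR is compatible with any model, LLAMA2 included; the LLM only ever sees the final prompt string. Below is a minimal sketch of swapping dummy_llm for Llama 2 via the transformers text-generation pipeline. The checkpoint name meta-llama/Llama-2-7b-chat-hf is just an example (it requires accepting the license on the Hub and a GPU with enough memory); any instruction-tuned causal LM works the same way.
# pip install transformers accelerate
from transformers import pipeline

# Assumes you are logged in with an account that has Llama 2 access;
# substitute any text-generation checkpoint you can load.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # example checkpoint, not the only option
    device_map="auto",  # let accelerate place the weights
)

# reuse the retrieval-augmented prompt built above
result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
One more note on the faiss part: IndexFlatL2 ranks by Euclidean distance. If you prefer cosine similarity, L2-normalize the embeddings (e.g. with faiss.normalize_L2) and use faiss.IndexFlatIP instead.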