Commit 64bc1c6 by bwang0911 (1 parent: 8ba26fe)

Update README.md

Files changed (1)
  1. README.md +30 -0
README.md CHANGED
@@ -1783,6 +1783,36 @@ embeddings = finetuner.encode(
  print(finetuner.cos_sim(embeddings[0], embeddings[1]))
  ```

+ Use directly with Hugging Face Transformers:
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]
+     input_mask_expanded = (
+         attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     )
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
+         input_mask_expanded.sum(1), min=1e-9
+     )
+
+ sentences = ['how is the weather today', 'What is the current weather like today?']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embedding-l-en-v1')
+ model = AutoModel.from_pretrained('jinaai/jina-embedding-l-en-v1')
+
+ with torch.inference_mode():
+     encoded_input = tokenizer(
+         sentences, padding=True, truncation=True, return_tensors='pt'
+     )
+     model_output = model.encoder(**encoded_input)
+     embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+ ```
+
  ## Fine-tuning

  Please consider [Finetuner](https://github.com/jina-ai/finetuner).
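
Note on the added snippet: the `mean_pooling` helper in this commit averages token embeddings while using the attention mask to leave padding positions out of the average. The dependency-free sketch below illustrates the same arithmetic on made-up toy values. The names `masked_mean` and `cos_sim` here are hypothetical helpers written for this illustration, not part of the README or of any library, and the numbers are not model outputs.

```python
import math


def masked_mean(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding (0) is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vector)]
    # Mirror the clamp in mean_pooling: avoid dividing by zero on an all-padding row.
    count = max(sum(attention_mask), 1e-9)
    return [t / count for t in total]


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy values, made up for illustration: 3 token positions, 2 dimensions.
# The last position of the first sentence is padding and must not affect the mean.
sentence_a = masked_mean([[1.0, 3.0], [3.0, 1.0], [9.0, 9.0]], [1, 1, 0])
sentence_b = masked_mean([[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]], [1, 1, 1])

print(sentence_a)  # [2.0, 2.0]
print(sentence_b)  # [4.0, 4.0]
print(cos_sim(sentence_a, sentence_b))
```

In the first toy sentence the padded vector `[9.0, 9.0]` is skipped, so the mean is taken over two tokens rather than three. The resulting sentence vectors can then be compared with cosine similarity, much as the earlier README example does with `finetuner.cos_sim`.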