ssmits committed
Commit 888636b
Parent: b550d04

Update README.md

Files changed (1): README.md +30 -1
README.md CHANGED
@@ -25,4 +25,33 @@ pipeline_tag: text-classification
 Embeddings version of the base model [ssmits/Falcon2-5.5B-multilingual](https://huggingface.co/ssmits/Falcon2-5.5B-multilingual).
 The 'lm_head' layer of this model has been removed, which means it can be used for embeddings. It will not perform well as-is; it needs further fine-tuning, as demonstrated by [intfloat/e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct).
 Additionally, instead of a normalization layer, the hidden layers are followed by a classical weight array and a bias array, each 1-dimensional with 4096 values.
-Further research is needed to determine whether this architecture will fully function when adding a classification head in combination with the transformers library.
+Further research is needed to determine whether this architecture will fully function when adding a classification head in combination with the transformers library.
+
+## Inference
+```python
+from sentence_transformers import SentenceTransformer
+import torch
+
+# 1. Load the pretrained Sentence Transformer model
+model = SentenceTransformer("ssmits/Falcon2-5.5B-multilingual-embed-base")
+
+# The sentences to encode
+sentences = [
+    "The weather is lovely today.",
+    "It's so sunny outside!",
+    "He drove to the stadium.",
+]
+
+# 2. Calculate embeddings; convert_to_tensor=True yields the torch tensor step 3 needs
+embeddings = model.encode(sentences, convert_to_tensor=True)
+print(embeddings.shape)
+# torch.Size([3, 4096])
+
+# 3. Calculate the embedding similarities
+# using torch to compute the full cosine-similarity matrix
+similarities = torch.nn.functional.cosine_similarity(embeddings.unsqueeze(0), embeddings.unsqueeze(1), dim=2)
+print(similarities)
+# tensor([[1.0000, 0.7120, 0.5937],
+#         [0.7120, 1.0000, 0.5925],
+#         [0.5937, 0.5925, 1.0000]])
+```
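
The card states that `lm_head` was removed so the checkpoint can serve as an embedding model. As a minimal sketch (not from the model card), this is what that implies when using plain `transformers` instead of `sentence_transformers`; it assumes the checkpoint loads via `AutoModel` and uses attention-mask mean pooling, a common default that the card does not actually specify.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Sketch only: AutoModel compatibility and mean pooling are assumptions,
# not documented behavior of this checkpoint.
model_id = "ssmits/Falcon2-5.5B-multilingual-embed-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    # Falcon tokenizers often ship without a pad token; reuse EOS for padding.
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModel.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

sentences = ["The weather is lovely today.", "It's so sunny outside!"]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    last_hidden = model(**batch).last_hidden_state  # (batch, seq_len, 4096)

# Mean-pool token states, ignoring padding positions via the attention mask.
mask = batch["attention_mask"].unsqueeze(-1).to(last_hidden.dtype)
embeddings = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # expected (2, 4096) given the 4096-dim hidden size
```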
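
The card's claim that a 1-dimensional weight and bias of 4096 values follow the hidden layers can be checked by enumerating parameters. A hypothetical inspection sketch: it assumes nothing about parameter names, only the documented shape.

```python
import torch
from transformers import AutoModel

# Sketch: list every 1-D parameter of length 4096 (the hidden size) to locate
# the weight/bias arrays described in the card. Names printed are whatever the
# checkpoint actually uses; none are assumed in advance.
model = AutoModel.from_pretrained(
    "ssmits/Falcon2-5.5B-multilingual-embed-base", torch_dtype=torch.bfloat16
)

for name, param in model.named_parameters():
    if param.ndim == 1 and param.numel() == 4096:
        print(name, tuple(param.shape))
```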