binhcode25 committed on
Commit
cd71211
1 Parent(s): 4efb231

Add new SentenceTransformer model.

Files changed (3)
  1. README.md +45 -0
  2. model.onnx +2 -2
  3. tokenizer.json +16 -2
README.md ADDED
@@ -0,0 +1,45 @@
+ ---
+ library_name: light-embed
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+
+ ---
+
+ # sbert-all-MiniLM-L12-v2-onnx
+
+ This is the ONNX version of the Sentence Transformers model sentence-transformers/all-MiniLM-L12-v2 for sentence embedding, optimized for speed and a small footprint. Because it runs on onnxruntime and tokenizers instead of heavier libraries such as sentence-transformers and transformers, it needs a smaller install and executes faster. Model details:
+ - Base model: sentence-transformers/all-MiniLM-L12-v2
+ - Embedding dimension: 384
+ - Max sequence length: 128
+ - File size on disk: 0.12 GB
+
+ This ONNX model consists of all components of the original Sentence Transformers model:
+ Transformer, Pooling, Normalize
+
+ <!--- Describe your model here -->
+
+ ## Usage (LightEmbed)
+
+ Using this model is easy once you have [LightEmbed](https://www.light-embed.net) installed:
+
+ ```
+ pip install -U light-embed
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from light_embed import TextEmbedding
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = TextEmbedding('sentence-transformers/all-MiniLM-L12-v2')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+ ## Citing & Authors
+
+ Binh Nguyen / [email protected]
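
Note on the README: it lists a Normalize stage as the last component, which suggests the returned embeddings are unit-length. If so, cosine similarity between two sentences reduces to a plain dot product. The sketch below illustrates this with pure Python; the three-element vectors are hypothetical stand-ins for real 384-dimensional embeddings, not model output, and the helper names are illustrative only.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length, as a Normalize stage would."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Full cosine similarity, valid for vectors of any length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical stand-ins for two embeddings (real ones have 384 dimensions).
a = l2_normalize([3.0, 4.0, 0.0])
b = l2_normalize([4.0, 3.0, 0.0])

# For unit-length vectors the two values agree up to floating-point error.
print(dot(a, b))
print(cosine_similarity(a, b))
```

With real output from `model.encode`, the same dot product could be taken row by row, assuming the embeddings are in fact normalized.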
model.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:aa2f30fd904338a4871270dca3b59d9b05477605bf53fdb217b76b57843a31f4
- size 133203053
+ oid sha256:94acbe687695083e8e2ebbcb2b6ddc53eda64617e38471af3c08337660b1ff4d
+ size 133203111
tokenizer.json CHANGED
@@ -1,7 +1,21 @@
  {
  "version": "1.0",
- "truncation": null,
- "padding": null,
+ "truncation": {
+ "direction": "Right",
+ "max_length": 128,
+ "strategy": "LongestFirst",
+ "stride": 0
+ },
+ "padding": {
+ "strategy": {
+ "Fixed": 128
+ },
+ "direction": "Right",
+ "pad_to_multiple_of": null,
+ "pad_id": 0,
+ "pad_type_id": 0,
+ "pad_token": "[PAD]"
+ },
  "added_tokens": [
  {
  "id": 0,