daulet25 committed on
Commit
cbf1997
1 Parent(s): bf68e56

update tokenizer to match latest release of command-nightly

Files changed (3)
  1. .gitattributes +0 -1
  2. README.md +27 -3
  3. tokenizer.json +2 -2
.gitattributes CHANGED
@@ -32,5 +32,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
- *.md filter=lfs diff=lfs merge=lfs -text
  *.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,27 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b8d84577259b158f02ef147d8780ea9705088abf4edcdf8d020b919d83418e01
- size 701
+ ---
+ license: apache-2.0
+ ---
+
+ Cohere `command-nightly` tokenizer
+
+ This is the tokenizer for the Cohere `command-nightly` [chat](https://docs.cohere.com/reference/chat) model.
+
+ You can load it with the tokenizer library like this:
+
+ ```python
+ from tokenizers import Tokenizer
+
+ tokenizer = Tokenizer.from_pretrained("Cohere/command-nightly")
+ text = "Hellö World, this is my input string!"
+ enc = tokenizer.encode(text)
+ print("Encoded input:")
+ print(enc)
+
+ inv_vocab = {v: k for k, v in tokenizer.get_vocab().items()}
+ tokens = [inv_vocab[token_id] for token_id in enc.ids]
+ print("Tokens:")
+ print(tokens)
+
+ number_of_tokens = len(enc.ids)
+ print("Number of tokens:", number_of_tokens)
+ ```
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c2a15f7b2a2f647bb4b288e1b2faeb7b7e7afe55acf014cb266ca44f3c29af9a
- size 3163464
+ oid sha256:5e07ab012ff6144f924acb0c46e8462ade9119d6375a7712a170dc7620291493
+ size 12777505
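
Note that the tokenizer.json hunk changes only a Git LFS pointer (a `version`, `oid`, and `size` line), not the JSON itself. The following is a minimal sketch, not part of this commit, of how one might check a locally downloaded file against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the three-line pointer layout shown in the diff with a `sha256` oid.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported oid algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# The new tokenizer.json pointer, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5e07ab012ff6144f924acb0c46e8462ade9119d6375a7712a170dc7620291493
size 12777505
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With the real file on disk (the local path is an assumption), the check would be `matches_pointer(open("tokenizer.json", "rb").read(), pointer)`. Comparing the size first is a cheap way to reject an obviously wrong file before hashing roughly 12 MB.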