---

title: Tokenizer Arena
emoji: 
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.38.1
app_file: app.py
pinned: false
datasets:
  - cc100
tags:
  - tokenizer
short_description: Compare tokenizers at the character and byte level.
---





Please visit our GitHub repo for more information: https://github.com/xu-song/tokenizer-arena


## Run the Gradio demo

```sh
python app.py
```



## Deploy to Hugging Face

```sh
python compression_util.py  # cache compression
python character_util.py  # cache character
python stats/sample.py  # ss
```
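As a rough illustration of what a character-level versus byte-level comparison can measure, the sketch below computes characters per token and UTF-8 bytes per token for a piece of text. It is not the project's code: `whitespace_tokenize` and `compression_stats` are hypothetical names, and the whitespace splitter stands in for a real tokenizer.

```python
from typing import Callable, Dict, List


def whitespace_tokenize(text: str) -> List[str]:
    # Hypothetical stand-in tokenizer: splits on runs of whitespace.
    return text.split()


def compression_stats(text: str, tokenize: Callable[[str], List[str]]) -> Dict[str, float]:
    # Count tokens, characters (Unicode code points), and UTF-8 bytes.
    n_tokens = len(tokenize(text))
    n_chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    if n_tokens == 0:
        # Avoid division by zero for empty or whitespace-only input.
        return {"n_tokens": 0, "chars_per_token": 0.0, "bytes_per_token": 0.0}
    return {
        "n_tokens": n_tokens,
        "chars_per_token": n_chars / n_tokens,
        "bytes_per_token": n_bytes / n_tokens,
    }


if __name__ == "__main__":
    for sample in ["hello world", "你好 世界"]:
        print(sample, compression_stats(sample, whitespace_tokenize))
```

The two ratios agree for ASCII text and diverge for multi-byte scripts, which is why reporting both can be useful when comparing tokenizers across languages.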