---
title: Tokenizer Arena
emoji: ⚔
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.38.1
app_file: app.py
pinned: false
datasets:
  - cc100
tags:
  - tokenizer
short_description: Compare different tokenizers at char level and byte level.
---

Please visit our GitHub repo for more information: https://github.com/xu-song/tokenizer-arena

## Run Gradio demo

```
python app.py
```

## Deploy to Hugging Face

Run the following scripts to build the caches before deploying:

```sh
python compression_util.py  # cache compression
python character_util.py  # cache character
python stats/sample.py  # ss
```
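
The short description mentions comparing tokenizers at char level and byte level. As a rough illustration of what such a comparison could measure, here is a minimal sketch that computes characters per token and UTF-8 bytes per token for a piece of text. The function names `compression_stats` and `whitespace_tokenize` are hypothetical and are not taken from this repo; the whitespace tokenizer is only a stand-in, and `compression_util.py` may compute its statistics differently.

```python
from __future__ import annotations


def whitespace_tokenize(text: str) -> list[str]:
    """Toy stand-in tokenizer (assumption): split on whitespace."""
    return text.split()


def compression_stats(text: str, tokens: list[str]) -> dict[str, float]:
    """Return average characters and UTF-8 bytes covered by each token.

    Hypothetical helper for illustration; not the repo's implementation.
    """
    n_tokens = len(tokens)
    if n_tokens == 0:
        raise ValueError("tokens must not be empty")
    n_chars = len(text)                  # char-level size
    n_bytes = len(text.encode("utf-8"))  # byte-level size
    return {
        "chars_per_token": n_chars / n_tokens,
        "bytes_per_token": n_bytes / n_tokens,
    }


sample = "héllo wörld"
stats = compression_stats(sample, whitespace_tokenize(sample))
print(stats)
```

The two numbers differ whenever the text contains non-ASCII characters, since each such character occupies more than one byte in UTF-8. Swapping `whitespace_tokenize` for a real tokenizer's tokenize call would give comparable figures per tokenizer.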