---
language:
- amh
tags:
- Amharic
- Word Piece Tokenizer
- Tokenizer
license: cc-by-4.0
---
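```
from transformers import AutoTokenizer

# Load the Amharic WordPiece tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("israel/AmhWordPieceTokenizer")

# encode() returns a list of token ids; map the ids back to WordPiece tokens
encoding = tokenizer.encode("ኮሌጁ ቢያስተምርም ወደስራ የሚመድባቸው መንግስት ነው abcs")
tokens = tokenizer.convert_ids_to_tokens(encoding)
print(tokens)
```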
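For readers unfamiliar with WordPiece, the following is a minimal sketch of the greedy longest-match-first splitting that WordPiece tokenizers generally apply to each word. It is illustrative only: the `wordpiece_tokenize` helper and `toy_vocab` are hypothetical, the `##` continuation prefix and `[UNK]` token follow the common BERT convention and are assumed here, and the real vocabulary and normalization of `israel/AmhWordPieceTokenizer` may differ.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first split of a single word into subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate substring until it is found in the vocabulary
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # mark word-internal pieces
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary, invented for illustration only (not the model's vocabulary)
toy_vocab = {"መንግ", "##ስት", "ነው", "ab", "##c", "##s"}

for word in "መንግስት ነው abcs xyz".split():
    print(word, wordpiece_tokenize(word, toy_vocab))
```

With this toy vocabulary, a word is split into the longest known prefix followed by `##`-marked continuations, and a word with no matching pieces falls back to the unknown token.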