---
language: 
  - amh
tags:
- Amharic
- Word Piece Tokenizer
- Tokenizer
license: cc-by-4.0
---
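```
from transformers import AutoTokenizer

# Load the Amharic WordPiece tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("israel/AmhWordPieceTokenizer")

# encode() returns a plain list of token IDs, not an object with a .tokens attribute
ids = tokenizer.encode("ኮሌጁ ቢያስተምርም ወደስራ የሚመድባቸው መንግስት ነው abcs")

# Map the IDs back to their token strings to inspect the segmentation
tokens = tokenizer.convert_ids_to_tokens(ids)
print(tokens)
```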