---
language:
- ko
tags:
- ocr
widget:
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg
  example_title: word1
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/khs.jpg
  example_title: word2
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/m.jpg
  example_title: word3
pipeline_tag: image-to-text
license: apache-2.0
---

# korean trocr model

- A TrOCR model cannot OCR characters that are missing from its decoder's tokenizer, so this model uses a decoder whose character-level tokenizer also covers standalone Korean jamo (such as bare initial consonants), preventing them from being decoded as UNK (see the check sketched after this list).
- It was built using the know-how gained from the [2023 Kyowon Group AI OCR Challenge](https://dacon.io/competitions/official/236042/overview/description).
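
As a quick sanity check of that claim, you can tokenize a bare jamo with the model's decoder tokenizer and confirm it is not mapped to the unknown token. This is only an illustrative sketch using the standard `transformers` tokenizer API; the input string `"ㄱ"` is an example of mine, not something from the original card.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

# A bare initial consonant (jamo) that a syllable-only vocabulary would map to UNK.
text = "ㄱ"

ids = tokenizer(text, add_special_tokens=False)["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))

# If the character-level vocabulary covers jamo, the unknown token should not appear.
print(tokenizer.unk_token_id in ids)  # expected: False
```
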
## train datasets

AI Hub

- [Korean characters in various forms OCR](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=91)
- [Public administrative documents OCR](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=88)

## model structure

- encoder : [trocr-base-stage1's encoder](https://huggingface.co/microsoft/trocr-base-stage1)
- decoder : [KR-BERT-char16424](https://huggingface.co/snunlp/KR-BERT-char16424)

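The pairing above can be wired together with the standard `VisionEncoderDecoderModel.from_encoder_decoder_pretrained` helper. The snippet below is only a hedged sketch of that wiring, not the actual training script used for this checkpoint; the generation settings (`decoder_start_token_id`, `max_length`, etc.) are illustrative values.

```python
from transformers import TrOCRProcessor, AutoTokenizer, VisionEncoderDecoderModel

# Pair the TrOCR vision encoder with the KR-BERT character-level decoder.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/trocr-base-stage1", "snunlp/KR-BERT-char16424"
)

# Image preprocessing comes from the TrOCR processor, text handling from KR-BERT.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-stage1")
tokenizer = AutoTokenizer.from_pretrained("snunlp/KR-BERT-char16424")

# Generation-related config the decoder needs before training/inference
# (illustrative values, not necessarily those used for this checkpoint).
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.vocab_size = model.config.decoder.vocab_size
model.config.max_length = 64
```
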
## how to use

```python
import unicodedata
from io import BytesIO

import requests
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel, AutoTokenizer

# Load the image processor, OCR model, and decoder tokenizer.
processor = TrOCRProcessor.from_pretrained("ddobokki/ko-trocr")
model = VisionEncoderDecoderModel.from_pretrained("ddobokki/ko-trocr")
tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

# Download one of the example word images.
url = "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content))

# Preprocess the image and generate the transcription.
pixel_values = processor(img, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_length=64)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Recombine any decomposed jamo into complete syllables before printing.
generated_text = unicodedata.normalize("NFC", generated_text)
print(generated_text)
```
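
The same objects also handle several crops at once, since both the processor and `generate` accept batches. The following is a small sketch reusing the other two example images from the widget section; the variable names are mine, not part of the original card.

```python
# Batched inference over several word images (assumes the objects above are already loaded).
urls = [
    "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/khs.jpg",
    "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/m.jpg",
]
images = [Image.open(BytesIO(requests.get(u).content)) for u in urls]

pixel_values = processor(images, return_tensors="pt").pixel_values  # shape: (batch, 3, H, W)
generated_ids = model.generate(pixel_values, max_length=64)
texts = [
    unicodedata.normalize("NFC", t)
    for t in tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
]
print(texts)
```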