metadata
language: ko
license: apache-2.0

team-lucid/t5-v1_1-large-ko

t5-v1_1-large-ko is Google's T5 Version 1.1 model trained on a Korean corpus.

To avoid out-of-vocabulary (OOV) tokens, the tokenizer uses byte-level BPE (BBPE). Because HyperCLOVA showed that morphological analysis helps improve performance, MeCab was used during tokenizer training so that morphemes are not split into unnatural tokens.
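The reason a byte-level vocabulary avoids OOV can be seen in a small sketch. It only illustrates the general idea behind byte-level BPE and is not the tokenizer's actual training code: the function names are hypothetical, and the learned merges and the MeCab pre-segmentation step are not shown.

```python
from typing import List


def byte_level_symbols(text: str) -> List[int]:
    """Return the UTF-8 byte values of `text` (hypothetical helper).

    Byte-level BPE starts from these 256 possible values and learns
    merges on top of them, so any string can always be represented.
    """
    return list(text.encode("utf-8"))


def restore(symbols: List[int]) -> str:
    """Invert byte_level_symbols: rebuild the original string."""
    return bytes(symbols).decode("utf-8")


sample = "한국어"
symbols = byte_level_symbols(sample)

# Each Hangul syllable takes 3 bytes in UTF-8, so 3 characters give 9 symbols.
print(len(sample), len(symbols))
# Every symbol fits in the fixed base alphabet of 256 byte values.
print(all(0 <= b < 256 for b in symbols))
# The mapping is lossless.
print(restore(symbols) == sample)
```

Since every input reduces to bytes, a rare or unseen Korean syllable costs more tokens but never maps to an unknown token.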

This model was trained on Cloud TPUs provided through Google's TPU Research Cloud (TRC).

Usage

from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained('team-lucid/t5-v1_1-large-ko')
model = T5ForConditionalGeneration.from_pretrained('team-lucid/t5-v1_1-large-ko')
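Once loaded, the model can be called like any other T5 checkpoint in transformers. The following is a minimal sketch, assuming PyTorch is installed; the helper name generate_text and the max_new_tokens default are illustrative and not part of this repository. Google's T5 v1.1 checkpoints are pretrained without supervised tasks, so this model likely needs fine-tuning before its output is useful for a downstream task.

```python
def generate_text(text, model_name="team-lucid/t5-v1_1-large-ko", max_new_tokens=32):
    """Encode `text`, run generation, and decode the result (hypothetical helper)."""
    # Imported lazily so that defining the function does not load the libraries.
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Pass only the tensors T5 expects; some tokenizers also return
        # token_type_ids, which generate() would reject.
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_text("...") downloads the checkpoint on first use. For repeated calls, load the tokenizer and model once outside the function.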