update readme
README.md ADDED
@@ -0,0 +1,37 @@
---
language: ko
---

# KoBigBird

Pretrained BigBird Model for Korean (**kobigbird-bert-base**)

## About

BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences.

BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost than BERT. It has achieved SOTA results on various tasks involving very long sequences, such as long document summarization and question answering with long contexts.
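
As a quick sanity check, these settings can be read from the model configuration. A minimal sketch (the values in the comments are expectations based on the description above, not verified output from this checkpoint):

```python
from transformers import AutoConfig

# Load only the configuration to inspect the sparse-attention settings.
config = AutoConfig.from_pretrained("monologg/kobigbird-bert-base")

print(config.attention_type)           # expected: "block_sparse"
print(config.max_position_embeddings)  # expected: 4096 (maximum sequence length)
print(config.block_size)               # block size used by block sparse attention
print(config.num_random_blocks)        # random blocks each query block attends to
```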

The model is warm-started from a Korean BERT checkpoint.

## How to use

**Warning:** Please use `BertTokenizer` instead of `BigBirdTokenizer`; the example below loads the tokenizer via `AutoTokenizer`.

```python
from transformers import AutoModel, AutoTokenizer

# By default the model is in `block_sparse` mode with num_random_blocks=3, block_size=64.
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")

# You can switch `attention_type` to full attention like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", attention_type="original_full")

# You can change `block_size` & `num_random_blocks` like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", block_size=16, num_random_blocks=2)

tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
text = "한국어 BigBird 모델을 공개합니다!"  # "We are releasing a Korean BigBird model!"
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)
```
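
Since the main benefit of KoBigBird is the 4096-token limit, here is a minimal long-input sketch (the repeated placeholder text and the shape noted in the comment are illustrative assumptions, not output from this checkpoint):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")

# Placeholder long document; truncate to the model's 4096-token limit.
long_text = "한국어 BigBird 모델을 공개합니다! " * 1000
encoded_input = tokenizer(long_text, max_length=4096, truncation=True, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded_input)

print(output.last_hidden_state.shape)  # e.g. torch.Size([1, 4096, 768]) for a base-size model
```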