The model was trained on approximately 11B tokens (batch size 64, sequence length 512, 350k steps).
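That figure follows directly from the training hyperparameters; a quick back-of-the-envelope check (assuming every step processes a full batch of 512-token sequences):

```python
# Sanity check of the quoted pretraining token count:
# 350k steps x 64 sequences/batch x 512 tokens/sequence
batch_size = 64
seq_len = 512
steps = 350_000

total_tokens = batch_size * seq_len * steps
print(f"{total_tokens / 1e9:.2f}B tokens")  # 11.47B tokens
```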
## Load tokenizer

```
>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("flax-community/bengali-t5-base")
>>> tokenizer.encode("আমি বাংলার গান গাই")
>>> tokenizer.decode([93, 1912, 814, 5995, 3, 1])
```

Output:

```
[93, 1912, 814, 5995, 3, 1]
'আমি বাংলার গান গাই </s>'
```
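In the decoded string, id 1 is T5's end-of-sequence token `</s>`, which the tokenizer appends during encoding. The round trip can be sketched with a toy whitespace tokenizer; the vocabulary below is made up for illustration and does not match the real SentencePiece vocabulary, which also emits subword pieces (such as the extra id 3 in the real output):

```python
# Toy sketch of the encode/decode round trip.
# Hypothetical vocabulary, NOT the real SentencePiece model of bengali-t5-base.
toy_vocab = {"আমি": 93, "বাংলার": 1912, "গান": 814, "গাই": 5995, "</s>": 1}
toy_id_to_piece = {i: p for p, i in toy_vocab.items()}

def toy_encode(text):
    # Split on whitespace and append the EOS id, as T5 tokenizers do.
    return [toy_vocab[w] for w in text.split()] + [toy_vocab["</s>"]]

def toy_decode(ids):
    # Map each id back to its piece and join with spaces.
    return " ".join(toy_id_to_piece[i] for i in ids)

ids = toy_encode("আমি বাংলার গান গাই")
print(ids)              # [93, 1912, 814, 5995, 1]
print(toy_decode(ids))  # আমি বাংলার গান গাই </s>
```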
## Load model

```
from transformers import FlaxT5ForConditionalGeneration, T5Config

config = T5Config.from_pretrained("flax-community/bengali-t5-base")
model = FlaxT5ForConditionalGeneration.from_pretrained("flax-community/bengali-t5-base", config=config)
```
Please note that we have not fine-tuned the model on any downstream task. If you fine-tune it on a downstream task, please let us know about it. Shoot us an email (sbmaruf at gmail dot com).
## Proposal
- [Project Proposal](https://discuss.huggingface.co/t/pretrain-t5-from-scratch-in-bengali/7121)