# BERT base model for Bangla
A pretrained BERT model for Bangla, trained with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
## Model Details
This model follows the BERT-Base architecture: 12 layers, a hidden size of 768, 12 attention heads, and roughly 110 million parameters. It was pretrained on a 39 GB corpus of Bangla text with a vocabulary of 50k tokens, for 1 million steps with a batch size of 440 and a learning rate of 5e-5, on two NVIDIA A40 GPUs.
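If you want to verify these hyperparameters programmatically, the configuration shipped with the checkpoint can be inspected. The snippet below is a minimal sketch; it assumes the repository id `banglagov/banBERT-Base` used in the usage example further down.

```python
from transformers import AutoConfig

# Load the configuration that ships with the checkpoint
# (repository id taken from the usage example below).
config = AutoConfig.from_pretrained("banglagov/banBERT-Base")

# These fields should match the architecture described above.
print(config.num_hidden_layers)    # expected: 12
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
print(config.vocab_size)           # expected: ~50k
```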
## How to use
```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained model and tokenizer from the Hugging Face Hub
model = AutoModel.from_pretrained("banglagov/banBERT-Base")
tokenizer = AutoTokenizer.from_pretrained("banglagov/banBERT-Base")

# Encode a sample Bangla sentence ("I study in Bangla.") and run a forward pass;
# outputs.last_hidden_state holds the contextual token embeddings
text = "আমি বাংলায় পড়ি।"
tokenized_text = tokenizer(text, return_tensors="pt")
outputs = model(**tokenized_text)
```
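Because the model was pretrained with an MLM objective, it can also be asked to predict a masked token directly. The sketch below is an assumption-laden example: it presumes the MLM head was exported with this checkpoint and that the tokenizer uses BERT's default `[MASK]` token.

```python
from transformers import pipeline

# Fill-mask pipeline; assumes the MLM head is included in the checkpoint.
fill_mask = pipeline("fill-mask", model="banglagov/banBERT-Base")

# Mask one token in the example sentence ("I ___ in Bangla.") and print candidates.
masked_sentence = "আমি বাংলায় [MASK]।"
for prediction in fill_mask(masked_sentence):
    print(prediction["token_str"], round(prediction["score"], 3))
```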