Bengali GPT-2

Bengali GPT-2 demo, built as part of the Hugging Face JAX/Flax event. It also features a model fine-tuned on Bengali song lyrics.

Model Description

The OpenAI GPT-2 model was proposed in the paper Language Models are Unsupervised Multitask Learners. The original GPT-2 model is a causal (unidirectional) transformer pretrained with a language modeling objective on a very large corpus of ~40 GB of text data. This model has the same configuration but has been pretrained on the Bengali portion of the mC4 (multilingual C4) dataset. The code for training the model has been open-sourced here.

Training Details

Overall Result:

Eval loss: 1.45, Eval perplexity: 3.141

Data: mC4-bn

Train Steps: 250k steps

Model: 🤗 flax-community/gpt2-bengali

Demo: https://huggingface.co/spaces/flax-community/Gpt2-bengali

Usage

There are multiple ways to use the model. For example, the pipeline can be used directly to generate sentences.

from transformers import pipeline

gpt2_bengali = pipeline("text-generation", model="flax-community/gpt2-bengali", tokenizer="flax-community/gpt2-bengali")
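
Once the pipeline is created, it can be called with a prompt to produce text. The following is a minimal sketch of that step, not the authors' own code: the helper names `build_generator` and `generate_texts`, the example prompt, and the sampling settings (`max_length`, `top_k`, `num_return_sequences`) are illustrative assumptions rather than values recommended by the model card.

```python
def build_generator(model_id="flax-community/gpt2-bengali"):
    """Create a text-generation pipeline for the given model id.

    The import is done lazily so the helpers below can be used
    without loading transformers until a model is actually needed.
    """
    from transformers import pipeline

    return pipeline("text-generation", model=model_id, tokenizer=model_id)


def generate_texts(generator, prompt, max_length=50, num_return_sequences=2):
    """Sample continuations of `prompt` and return them as plain strings.

    `generator` is any callable that behaves like a text-generation
    pipeline, i.e. returns a list of dicts with a "generated_text" key.
    The sampling settings here are assumptions chosen for illustration.
    """
    outputs = generator(
        prompt,
        max_length=max_length,
        do_sample=True,
        top_k=50,
        num_return_sequences=num_return_sequences,
    )
    return [item["generated_text"] for item in outputs]


# Example usage (downloads the model weights on first run):
#   gpt2_bengali = build_generator()
#   for text in generate_texts(gpt2_bengali, "আমি বাংলায়"):
#       print(text)
```

Because `generate_texts` only relies on the pipeline's output format, the same helper should work with any other text-generation pipeline, including one built from a fine-tuned checkpoint.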

Similarly, the model fine-tuned on Bangla songs can be used as follows.

from transformers import pipeline

singer = pipeline("text-generation", model="khalidsaifullaah/bengali-lyricist-gpt2", tokenizer="khalidsaifullaah/bengali-lyricist-gpt2")

For other tasks, the model needs to be fine-tuned on custom datasets. Details can be found in the Hugging Face documentation.
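
As a rough illustration of what preparing a custom dataset for causal language model fine-tuning can involve, the sketch below concatenates already-tokenized texts and splits them into fixed-length blocks, with the labels set equal to the inputs. This is a common preprocessing pattern and an assumption here, not a description of how this model was trained; the function name `group_texts` and the tiny `block_size` are hypothetical choices made for readability.

```python
def group_texts(token_id_lists, block_size=4):
    """Concatenate tokenized texts and cut them into equal-sized blocks.

    token_id_lists: list of lists of token ids, one list per document.
    block_size: number of tokens per training example (illustrative value;
                real runs typically use a much larger block).
    Any leftover tokens that do not fill a whole block are dropped.
    """
    concatenated = [tok for ids in token_id_lists for tok in ids]
    usable_length = (len(concatenated) // block_size) * block_size
    input_ids = [
        concatenated[i : i + block_size]
        for i in range(0, usable_length, block_size)
    ]
    # For causal language modeling the labels are a copy of the inputs;
    # the shift by one position is usually handled inside the model.
    labels = [list(block) for block in input_ids]
    return {"input_ids": input_ids, "labels": labels}


example = group_texts([[1, 2, 3], [4, 5], [6, 7, 8, 9]], block_size=4)
print(example["input_ids"])
```

In practice the token ids would come from the model's tokenizer, and the resulting blocks would be passed to a training loop or trainer utility of your choice.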

Contributors

  • Khalid Saifullah
  • Tasmiah Tahsin Mayeesha
  • Ritobrata Ghosh
  • Ibrahim Musa
  • M Saiful Bari

BibTeX entry and citation info

@misc{flax_community_2023,
  author    = {{Flax Community}},
  title     = {gpt2-bengali (Revision cb8fff6)},
  year      = 2023,
  url       = {https://huggingface.co/flax-community/gpt2-bengali},
  doi       = {10.57967/hf/0938},
  publisher = {Hugging Face}
}
