---
license: mit
language:
- ne
metrics:
- rouge
pipeline_tag: text-generation
tags:
- Nepali summary
- Nepali bart
- Nepali
- summary
---

# Nep_Summ_BART

This model was pre-trained with BART's denoising objective on a Nepali corpus and then fine-tuned on Nepali summarization data. Given an input text, it generates a summary.

The model has 101M parameters.

## Model Details

### Model Description

The model is pre-trained with BART noising techniques: sentence permutation, token deletion, and random token masking. The noisy text is fed to the transformer encoder, and the transformer decoder carries out the denoising objective by reconstructing the original text.
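
As an illustration of the three noising techniques named above, here is a minimal Python sketch. It is not the training code used for this model: it works on whitespace-separated words rather than the model's subword tokens, and the `<mask>` symbol, the probabilities (`0.1` and `0.15`), the danda delimiter, and all function names are assumptions chosen for the example.

```python
import random

MASK_TOKEN = "<mask>"  # assumed mask symbol; the model's real tokenizer may differ


def permute_sentences(text, rng, delimiter="।"):
    """Shuffle the order of the sentences, split on the Nepali danda."""
    sentences = [s.strip() for s in text.split(delimiter) if s.strip()]
    rng.shuffle(sentences)
    return f" {delimiter} ".join(sentences) + f" {delimiter}"


def delete_tokens(tokens, rng, prob=0.1):
    """Drop each token independently with probability `prob` (assumed rate)."""
    kept = [t for t in tokens if rng.random() >= prob]
    return kept or tokens[:1]  # avoid returning an empty sequence


def mask_tokens(tokens, rng, prob=0.15):
    """Replace each token independently with MASK_TOKEN with probability `prob`."""
    return [MASK_TOKEN if rng.random() < prob else t for t in tokens]


def add_noise(text, seed=0):
    """Return (noisy_input, clean_target) for one denoising training pair."""
    rng = random.Random(seed)
    tokens = permute_sentences(text, rng).split()
    tokens = delete_tokens(tokens, rng)
    tokens = mask_tokens(tokens, rng)
    return " ".join(tokens), text


noisy, target = add_noise("म घर जान्छु । ऊ किताब पढ्छ । हामी खाना खान्छौं ।")
print(noisy)   # corrupted text, which would go to the encoder
print(target)  # original text, which the decoder learns to reconstruct
```

In this sketch the encoder would see `noisy` and the decoder would be trained to produce `target`, which mirrors the encoder/decoder split described above.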

Standard cross-entropy loss is used for both pre-training and fine-tuning.
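
To make the loss concrete, the sketch below computes a token-level cross-entropy by hand in plain Python. It is an illustration only: the logits are made-up numbers, the function name is hypothetical, and averaging over time steps is an assumption about the reduction. The targets would be the original uncorrupted text during pre-training and the reference summary during fine-tuning.

```python
import math


def token_cross_entropy(logits, target_ids):
    """Mean negative log-likelihood of the target ids under softmax(logits)."""
    total = 0.0
    for step_logits, target in zip(logits, target_ids):
        # log-sum-exp with the maximum subtracted for numerical stability
        peak = max(step_logits)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in step_logits))
        total += log_norm - step_logits[target]  # equals -log softmax(logits)[target]
    return total / len(target_ids)


# Toy decoder outputs: 2 time steps over a 3-token vocabulary (made-up numbers).
logits = [[2.0, 0.5, 0.1], [0.2, 0.2, 3.0]]
target_ids = [0, 2]
loss = token_cross_entropy(logits, target_ids)
print(round(loss, 4))
```

The loss shrinks as the decoder assigns more probability to the correct next token at each step.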

The pre-training loss per epoch is as follows:

| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 1 | 0.8137 | 0.8010 |
| 2 | 0.7861 | 0.7524 |
| 3 | 0.7495 | 0.7290 |

The ROUGE scores of the fine-tuned model on the BBC XL-Sum Nepali test set are:

| Metric | Score |
|:------:|:-----:|
| ROUGE-1 | 0.177 |
| ROUGE-2 | 0.059 |
| ROUGE-L | 0.154 |
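
For readers unfamiliar with these metrics, the sketch below computes simplified ROUGE-1, ROUGE-2, and ROUGE-L F1 scores on whitespace-separated tokens. It is only an illustration under that tokenization assumption. This card does not state which scorer or tokenization produced the numbers above, so the code should not be expected to reproduce them. All function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when nothing matches."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision, recall = overlap / cand_total, overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 on whitespace tokens (simplified, no stemming)."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, one DP row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.split(), reference.split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


candidate = "चिनीको अभाव कायम रहेको छ"
reference = "काठमाडौंमा चिनीको अभाव कायम छ"
print(round(rouge_n(candidate, reference, 1), 3),
      round(rouge_n(candidate, reference, 2), 3),
      round(rouge_l(candidate, reference), 3))  # prints: 0.8 0.5 0.8
```

Whitespace splitting is used here because it keeps Devanagari words intact; a production scorer would typically add language-aware tokenization.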

## Uses

Use this model to summarize Nepali text.

## How to Get Started with the Model

Use the code below to get started with the model.

    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # Load the tokenizer and model directly from the Hub
    tokenizer = AutoTokenizer.from_pretrained("pascalrai/nep_summ_BART")
    model = AutoModelForSeq2SeqLM.from_pretrained("pascalrai/nep_summ_BART")

    # Run on the GPU when one is available, otherwise on the CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)

    sentence = """अत्यधिक माग भएका बेला दसैंमा चिनीको हाहाकार भएको थियो । उपत्यकाबाहिरका केही जिल्लामा चिनी पाइए पनि काठमाडौंमा भने अभाव नै कायम रहेको छ । प्रधानमन्त्री पुष्पकमल दाहालले बिहीबार बिहान उद्योग तथा वाणिज्य मन्त्री तथा मुख्यसचिवलाई चिनीको अभाव सिर्जना हुन नदिन सबै उपायको खोजी गर्न निर्देशन दिएका थिए ।

    नेपाली चिनी उद्योगहरूले आम उपभोक्तालाई सहज हुने किसिमले बजारमा चिनी नपठाइ ठूला उद्योगलाई आपूर्ति गर्न गोदाममै राख्ने गरेको पनि भेटिएको छ । वाणिज्य विभागको तथ्यांक अनुसार, नेपालमा उत्पादन हुने चिनीको सत्तरी प्रतिशत चिनी बिभिन्न पेय पदार्थ, मिठाइ, चकलेट, विस्कुटलगायतका उद्योगहरुमा आपूर्ति हुने गर्दछ ।

    नेपाल प्रहरीले नेपालमा रहेका सबै चिनी उद्योगको स्टक रेकर्ड चेक गर्ने तथा सो आधारमा बजारमा चिनी पठाउन उद्योगीहरूसँग छलफल गरिने विभागले जनाएको छ"""

    # truncation=True is needed for max_length to take effect on long inputs
    inputs = tokenizer(sentence, max_length=1000, truncation=True, return_tensors="pt")
    summary_ids = model.generate(inputs["input_ids"].to(device))

    # Decode the generated token ids back into text
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
    print(summary)

Output:

    दशैंका बेला सबै चिनी उद्योगको स्टक रेकर्ड गर्ने र बजारमा चिनी पठाउन उद्योगीहरूसँग छलफल गरिने अधिकारीहरूले बताएका छन्। यसकारण, उपत्यकाबाहिरका केही जिल्लामा पनि चिनीको अभाव कायम रहेको अधिकारीहरूले बताएका छन्।

#### Hardware

The model was trained on a single A10G GPU on an AWS instance, with each epoch taking roughly 2 days.