sinhala-nlp/NSINA-Headlines-mt5-base

Text2Text Generation

tharindu committed on Apr 2

Commit dddc91c • 1 Parent(s): 5305790

Update README.md

Files changed (1)

README.md +36 -0

@@ -1,3 +1,39 @@
 ---
 license: cc-by-sa-4.0
+datasets:
+- sinhala-nlp/NSINA-Headlines
+- sinhala-nlp/NSINA
+language:
+- si
 ---
+# Sinhala Headline Generation
+This is a text generation task built on the [NSINA dataset](https://github.com/Sinhala-NLP/NSINA). The dataset is released under the same license as NSINA. The objective of the task is to generate a news headline from the provided news content.
+## Data
+We used all instances from NSINA 1.0, since every news article has a headline. We divided the dataset into training and test sets following a 0.8 split.
+Data can be loaded into pandas dataframes using the following code.
+```python
+from datasets import load_dataset
+# Load each split and convert it to a pandas DataFrame
+train = load_dataset('sinhala-nlp/NSINA-Headlines', split='train').to_pandas()
+test = load_dataset('sinhala-nlp/NSINA-Headlines', split='test').to_pandas()
+```
+## Citation
+If you are using the dataset or the models, please cite the following paper.
+~~~
+@inproceedings{Nsina2024,
+author={Hettiarachchi, Hansi and Premasiri, Damith and Uyangodage, Lasitha and Ranasinghe, Tharindu},
+title={{NSINA: A News Corpus for Sinhala}},
+booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+year={2024},
+month={May},
+}
+~~~
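
Reviewer note: the README states that the data was divided "following a 0.8 split" but does not show how. The sketch below illustrates one common way such a split could be produced, assuming 0.8 is the training fraction and that records are shuffled with a fixed seed. The function name `split_train_test`, the seed, and the `content`/`headline` field names are hypothetical and are not taken from the NSINA code or dataset schema.

```python
import random

def split_train_test(records, train_frac=0.8, seed=42):
    """Shuffle a copy of `records` and split it into (train, test).

    `train_frac` is the share of records placed in the training set.
    The seed and the shuffling step are assumptions for illustration.
    """
    shuffled = list(records)               # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    cut = int(len(shuffled) * train_frac)  # index where the test set starts
    return shuffled[:cut], shuffled[cut:]

# Toy stand-in for the corpus; field names are illustrative only.
toy = [{"content": f"article {i}", "headline": f"headline {i}"} for i in range(10)]

train, test = split_train_test(toy)
print(len(train), len(test))
```

With ten toy records, this places eight in the training set and two in the test set. The published `train` and `test` splits on the Hub should be preferred over re-splitting, since they are the ones the authors used.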