|
---
tags:
- generated_from_trainer
datasets:
- cnn_dailymail
- xsum
- samsum
- billsum
- lytang/MeetingBank-transcript
metrics:
- rouge
model-index:
- name: t5_xsum_samsum_billsum_cnn_dailymail
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: cnn_dailymail
      type: cnn_dailymail
      config: 3.0.0
      split: train
      args: 3.0.0
    metrics:
    - name: Rouge1
      type: rouge
      value: 0.2373
license: mit
language:
- en
library_name: transformers
pipeline_tag: summarization
---
|
|
|
# t5_xsum_samsum_billsum_cnn_dailymail |
|
|
|
`t5_xsum_samsum_billsum_cnn_dailymail` is an abstractive text summarization model fine-tuned from `t5-base`, a versatile text-to-text transfer transformer. It was fine-tuned on multiple datasets: CNN/Daily Mail (`cnn_dailymail`), XSum (`xsum`), SAMSum (`samsum`), BillSum (`billsum`), and the `lytang/MeetingBank-transcript` dataset.
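A minimal usage sketch with the Transformers `pipeline` API; the model id below is assumed from this card's name and may need to be replaced with the actual Hub namespace or a local path:

```python
from transformers import pipeline

# Model id assumed from the card name; replace with the real Hub repo or local path.
summarizer = pipeline("summarization", model="t5_xsum_samsum_billsum_cnn_dailymail")

article = (
    "The Eiffel Tower is 324 metres tall, about the same height as an "
    "81-storey building, and was the tallest man-made structure in the "
    "world for 41 years after its completion in 1889."
)
print(summarizer(article, max_length=60, min_length=10, do_sample=False)[0]["summary_text"])
```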
|
|
|
## Intended Uses & Limitations |
|
|
|
### Intended Uses |
|
|
|
- Document summarization: The model is well suited to summarizing lengthy documents or articles, making it useful for content curation and information extraction tasks (see the sketch after this list).

- Content generation: It can produce concise summaries of input text, useful for creating short, informative snippets.
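For longer documents, a lower-level sketch (under the same assumed model id) makes the usual T5 conventions explicit: the input carries a `summarize:` task prefix and is truncated to the encoder's 512-token limit, so only the truncated portion is summarized. `long_document` is a placeholder:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "t5_xsum_samsum_billsum_cnn_dailymail"  # assumed repo id / local path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

long_document = "..."  # placeholder for the text to summarize

# T5 checkpoints take a task prefix; anything past max_length is truncated.
inputs = tokenizer("summarize: " + long_document, return_tensors="pt",
                   truncation=True, max_length=512)
summary_ids = model.generate(**inputs, num_beams=4, max_length=64, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```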
|
|
|
### Limitations |
|
|
|
- Model size: Deploying the model may require significant computational resources, limiting its use in resource-constrained environments.

- Domain-specific content: While it performs well on general summarization tasks, quality may vary on specialized content outside the domains covered by its training data.
|
|
|
## Training and Evaluation Data |
|
|
|
The model has been trained on a diverse set of datasets, including CNN/Daily Mail, XSum, SamSum, BillSum, and the MeetingBank-transcript dataset. These datasets provide a wide range of text summarization examples, enabling the model to generalize across various domains and styles of text. |
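As an illustration, these corpora can be pulled with the `datasets` library; the `3.0.0` config below matches the one recorded in the model index:

```python
from datasets import load_dataset

# CNN/Daily Mail pairs news articles with bullet-point highlights.
cnn = load_dataset("cnn_dailymail", "3.0.0", split="train")
example = cnn[0]
print(example["article"][:200])   # source document
print(example["highlights"])      # reference summary
```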
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (see the `Seq2SeqTrainingArguments` sketch after the list):
|
- learning_rate: 2e-05 |
|
- train_batch_size: 2 |
|
- eval_batch_size: 2 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 1 |
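A sketch of the corresponding `Seq2SeqTrainingArguments`, reconstructed from the list above; `output_dir` and any option not listed (warmup, weight decay, etc.) are assumptions left at library defaults:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_xsum_samsum_billsum_cnn_dailymail",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    # Adam settings below restate the reported defaults.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```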
|
|
|
### Training results |
|
|
|
#### samsum |
|
|
|
| Rouge1 | Rouge2 | RougeL | RougeLsum | |
|
|:-------:|:-------:|:-------:|:---------:| |
|
| 0.0138 | 0.0002 | 0.0138 | 0.0138 | |
|
|
|
|
|
#### cnn_dailymail
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:| |
|
| 1.8486 | 1.0 | 32300 | 1.6478 | 0.2373 | 0.1086 | 0.1972 | 0.1971 | 18.9674 | |
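For reference, ROUGE scores in this 0-1 range can be reproduced with the `evaluate` library, assuming lists of decoded model summaries and reference summaries:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy inputs; in practice these are the decoded model outputs and gold summaries.
predictions = ["the tower is the tallest structure in paris"]
references = ["the eiffel tower is the tallest structure in paris"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```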
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.33.0 |
|
- Pytorch 2.0.0 |
|
- Datasets 2.1.0 |
|
- Tokenizers 0.13.3 |