lxyuan committed on
Commit
1e7a190
2 Parent(s): 8ea9e5f f61de99

Merge branch 'main' of https://huggingface.co/lxyuan/distilbart-finetuned-summarization into main

Files changed (2)
  1. README.md +123 -0
  2. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,123 @@
+ ---
+ tags:
+ - generated_from_trainer
+ - distilbart
+ model-index:
+ - name: distilbart-finetuned-summarization
+   results: []
+ license: apache-2.0
+ datasets:
+ - cnn_dailymail
+ - xsum
+ - samsum
+ - ccdv/pubmed-summarization
+ language:
+ - en
+ metrics:
+ - rouge
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # distilbart-finetuned-summarization
+
+ This model is a further fine-tuned version of [distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6), trained on the combination of four summarization datasets:
+ - [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail)
+ - [samsum](https://huggingface.co/datasets/samsum)
+ - [xsum](https://huggingface.co/datasets/xsum)
+ - [ccdv/pubmed-summarization](https://huggingface.co/datasets/ccdv/pubmed-summarization)
+
+ Please check out the official model page and paper:
+ - [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6)
+ - [Pre-trained Summarization Distillation](https://arxiv.org/abs/2010.13002)
+
+ ## Training and evaluation data
+
+ The combined dataset can be reproduced with the following code:
+
+ ```python
+ from datasets import DatasetDict, concatenate_datasets, load_dataset
+
+ # Load the four datasets, renaming columns so that every dataset
+ # shares the same "document"/"summary" schema.
+ xsum_dataset = load_dataset("xsum")
+ pubmed_dataset = load_dataset("ccdv/pubmed-summarization").rename_column("article", "document").rename_column("abstract", "summary")
+ cnn_dataset = load_dataset("cnn_dailymail", "3.0.0").rename_column("article", "document").rename_column("highlights", "summary")
+ samsum_dataset = load_dataset("samsum").rename_column("dialogue", "document")
+
+ # Concatenate the corresponding splits of all four datasets.
+ summary_train = concatenate_datasets([xsum_dataset["train"], pubmed_dataset["train"], cnn_dataset["train"], samsum_dataset["train"]])
+ summary_validation = concatenate_datasets([xsum_dataset["validation"], pubmed_dataset["validation"], cnn_dataset["validation"], samsum_dataset["validation"]])
+ summary_test = concatenate_datasets([xsum_dataset["test"], pubmed_dataset["test"], cnn_dataset["test"], samsum_dataset["test"]])
+
+ raw_datasets = DatasetDict()
+ raw_datasets["train"] = summary_train
+ raw_datasets["validation"] = summary_validation
+ raw_datasets["test"] = summary_test
+ ```
+
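+ Before training, the combined splits need to be tokenized into model inputs. Below is a minimal preprocessing sketch assuming the base model's tokenizer; the truncation lengths are illustrative, not taken from the actual run (see the linked notebook in the training section for the exact setup):
+
+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
+
+ # Illustrative limits: BART-family encoders accept up to 1024 positions.
+ max_input_length = 1024
+ max_target_length = 128
+
+ def preprocess_function(examples):
+     model_inputs = tokenizer(
+         examples["document"], max_length=max_input_length, truncation=True
+     )
+     labels = tokenizer(
+         text_target=examples["summary"], max_length=max_target_length, truncation=True
+     )
+     model_inputs["labels"] = labels["input_ids"]
+     return model_inputs
+
+ tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)
+ ```
+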
+ ## Inference example
+
+ ```python
+ from transformers import pipeline
+
+ pipe = pipeline("text2text-generation", model="lxyuan/distilbart-finetuned-summarization")
+
+ text = """The tower is 324 metres (1,063 ft) tall, about the same height as
+ an 81-storey building, and the tallest structure in Paris. Its base is square,
+ measuring 125 metres (410 ft) on each side. During its construction, the
+ Eiffel Tower surpassed the Washington Monument to become the tallest man-made
+ structure in the world, a title it held for 41 years until the Chrysler Building
+ in New York City was finished in 1930. It was the first structure to reach a
+ height of 300 metres. Due to the addition of a broadcasting aerial at the top
+ of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres
+ (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest
+ free-standing structure in France after the Millau Viaduct.
+ """
+
+ pipe(text)
+
+ # Output:
+ # "The Eiffel Tower is the tallest man-made structure in the world .
+ # The tower is 324 metres tall, about the same height as an 81-storey building .
+ # Due to the addition of a broadcasting aerial in 1957, it is now taller than
+ # the Chrysler Building by 5.2 metres ."
+ ```
+
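+ For finer control over the summary (length bounds, beam search), the model can also be loaded directly. A sketch reusing `text` from above; the generation parameter values are illustrative assumptions, not settings confirmed for this model:
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("lxyuan/distilbart-finetuned-summarization")
+ model = AutoModelForSeq2SeqLM.from_pretrained("lxyuan/distilbart-finetuned-summarization")
+
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
+ summary_ids = model.generate(
+     **inputs,
+     num_beams=4,    # illustrative beam size
+     min_length=56,  # illustrative length bounds
+     max_length=142,
+ )
+ print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
+ ```
+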
+ ## Training procedure
+
+ Notebook link: [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/distilbart-finetune-summarisation.ipynb)
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
+ - evaluation_strategy="epoch"
+ - save_strategy="epoch"
+ - logging_strategy="epoch"
+ - learning_rate=2e-5
+ - per_device_train_batch_size=2
+ - per_device_eval_batch_size=2
+ - gradient_accumulation_steps=64
+ - total_train_batch_size: 128
+ - optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - weight_decay=0.01
+ - save_total_limit=2
+ - num_train_epochs=10
+ - predict_with_generate=True
+ - fp16=True
+ - push_to_hub=True
+
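+ These settings map onto the `Seq2SeqTrainer` API roughly as follows. This is a sketch, not the exact training script: `output_dir` is an assumed name, and `model`, `tokenizer`, and `tokenized_datasets` are assumed to come from the earlier snippets (the linked notebook has the authoritative setup):
+
+ ```python
+ from transformers import (
+     DataCollatorForSeq2Seq,
+     Seq2SeqTrainer,
+     Seq2SeqTrainingArguments,
+ )
+
+ args = Seq2SeqTrainingArguments(
+     output_dir="distilbart-finetuned-summarization",  # assumed name
+     evaluation_strategy="epoch",
+     save_strategy="epoch",
+     logging_strategy="epoch",
+     learning_rate=2e-5,
+     per_device_train_batch_size=2,
+     per_device_eval_batch_size=2,
+     gradient_accumulation_steps=64,  # 2 * 64 = total train batch size 128
+     weight_decay=0.01,
+     save_total_limit=2,
+     num_train_epochs=10,
+     predict_with_generate=True,
+     fp16=True,
+     push_to_hub=True,
+ )
+
+ trainer = Seq2SeqTrainer(
+     model=model,
+     args=args,
+     train_dataset=tokenized_datasets["train"],
+     eval_dataset=tokenized_datasets["validation"],
+     tokenizer=tokenizer,
+     data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
+ )
+ # trainer.train()
+ ```
+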
+ ### Training results
+
+ _Training is still in progress._
+
+ | Epoch | Training Loss | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
+ |-------|---------------|-----------------|---------|---------|---------|-----------|---------|
+ | 0     | 1.779700      | 1.719054        | 40.0039 | 17.9071 | 27.8825 | 34.8886   | 88.8936 |
+
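+ ROUGE scores like those above are typically computed during evaluation with a `compute_metrics` function passed to the trainer. A sketch along the lines of the standard Hugging Face summarization example, assuming the `evaluate` library and the `tokenizer` from the earlier snippets:
+
+ ```python
+ import evaluate
+ import numpy as np
+
+ rouge = evaluate.load("rouge")
+
+ def compute_metrics(eval_pred):
+     predictions, labels = eval_pred
+     decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
+     # Labels are padded with -100 for the loss; restore pad tokens before decoding.
+     labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
+     decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
+     result = rouge.compute(
+         predictions=decoded_preds, references=decoded_labels, use_stemmer=True
+     )
+     return {k: round(v * 100, 4) for k, v in result.items()}
+ ```
+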
+ ### Framework versions
+
+ - Transformers 4.30.2
+ - Pytorch 2.0.1+cu117
+ - Datasets 2.13.1
+ - Tokenizers 0.13.3
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8df57a92dddf33720eb147d0329d3a7dcdcce282dc0c8fe33e89e2be3e4a858e
+ size 1222284056