metadata
library_name: transformers
language:
  - en
tags:
  - text-summarization
  - t5
  - generated_from_trainer
license: apache-2.0
base_model: Falconsai/text_summarization
datasets:
  - agentlans/wikipedia-paragraph-summaries

Text Summarization Model

This model summarizes English paragraphs, condensing the main ideas while preserving essential information and context. It is a fine-tuned version of Falconsai/text_summarization, trained on the agentlans/wikipedia-paragraph-summaries dataset.

Intended Use

The model is intended for applications such as:

  • Summarizing articles and documents
  • Assisting in content curation
  • Enhancing information retrieval systems
  • Supporting educational tools by providing concise summaries

Usage Instructions

from transformers import pipeline

summarizer = pipeline("summarization", model="agentlans/text-summarization")

ARTICLE = "Your text here..."
print(summarizer(ARTICLE, max_length=1000, min_length=30, do_sample=False))
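For longer inputs, one option is to summarize each paragraph separately, since the model was fine-tuned on paragraph-level summaries. The following is a minimal sketch, not part of the original model card: the helper name `summarize_paragraphs` and the blank-line paragraph split are assumptions made for illustration. It only relies on the summarization pipeline returning a list of dicts with a `summary_text` key.

```python
from typing import Callable, List


def summarize_paragraphs(
    summarizer: Callable[..., list],
    text: str,
    max_length: int = 1000,
    min_length: int = 30,
) -> List[str]:
    """Summarize each blank-line-separated paragraph of `text` separately.

    `summarizer` is expected to behave like a transformers summarization
    pipeline (hypothetical helper; the paragraph split is an assumption).
    """
    # Split on blank lines and drop empty fragments.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    summaries = []
    for paragraph in paragraphs:
        # For a single string, the pipeline returns a list with one dict;
        # the generated text is stored under the "summary_text" key.
        result = summarizer(
            paragraph,
            max_length=max_length,
            min_length=min_length,
            do_sample=False,
        )
        summaries.append(result[0]["summary_text"])
    return summaries
```

To use it with this model, pass the `summarizer` object created above, for example `summarize_paragraphs(summarizer, ARTICLE)`.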

Examples

First three paragraphs of one of Winston Churchill's speeches, summarized by the base model (before fine-tuning) and by this model (after fine-tuning):

Paragraph 1

Original text: From the moment that the French defences at Sedan and on the Meuse were broken at the end of the second week of May, only a rapid retreat to Amiens and the south could have saved the British and French Armies who had entered Belgium at the appeal of the Belgian King, but this strategic fact was not immediately realised. The French Command hoped they would be able to close the gap, and the Armies of the north were under their orders. Moreover, a retirement of this kind would have involved almost certainly the destruction of the fine Belgian Army of over 20 divisions and the abandonment of the whole of Belgium. Therefore, when the force and scope of the German penetration were realised and when a new French Generalissimo, General Weygand, assumed command in place of General Gamelin, an effort was made by the French and British Armies in Belgium to keep on holding the right hand of the Belgians and to give their own right hand to a newly created French Army which was to have advanced across the Somme in great strength to grasp it.

Before finetune: The French Command hoped they would be able to close the gap, and the Armies of the north were under their orders . a retirement of this kind would have involved almost certainly the destruction of the fine Belgian Army of over 20 divisions and the abandonment of the whole of Belgium .

After finetune: The French and French Armies of the north were unable to close the gap, but a retirement would have involved the destruction of the Belgian Army and the abandonment of Belgium.

Paragraph 2

Original text: However, the German eruption swept like a sharp scythe around the right and rear of the Armies of the north. Eight or nine armoured divisions, each of about 400 armoured vehicles of different kinds, but carefully assorted to be complementary and divisible into small self-contained units, cut off all communications between us and the main French Armies. It severed our own communications for food and ammunition, which ran first to Amiens and afterwards through Abbeville, and it shore its way up the coast to Boulogne and Calais, and almost to Dunkirk. Behind this armoured and mechanised onslaught came a number of German divisions in lorries, and behind them again there plodded comparatively slowly the dull brute mass of the ordinary German Army and German people, always so ready to be led to the trampling down in other lands of liberties and comforts which they have never known in their own.

Before finetune: German eruption swept like a sharp scythe around the right and rear of the Armies of the north . Eight or nine armoured divisions cut off all communications between us and the main French Armies . Behind this onslaught came a number of German divisions in lorries, and behind them again there plodded slowly the dull brute mass of the ordinary German Army and German people .

After finetune: the German eruption swept around the armies of the north, cutting off communications between us and the main French Armies, and plodded a number of German divisions in lorries, ploding comparatively slowly the dull brute mass of the ordinary German Army and German people.

Paragraph 3

Original text: I have said this armoured scythe-stroke almost reached Dunkirk—almost but not quite. Boulogne and Calais were the scenes of desperate fighting. The Guards defended Boulogne for a while and were then withdrawn by orders from this country. The Rifle Brigade, the 60th Rifles, and the Queen Victoria's Rifles, with a battalion of British tanks and 1,000 Frenchmen, in all about 4,000 strong, defended Calais to the last. The British Brigadier was given an hour to surrender. He spurned the offer, and four days of intense street fighting passed before silence reigned over Calais, which marked the end of a memorable resistance. Only 30 unwounded survivors were brought off by the Navy and we do not know the fate of their comrades. Their sacrifice, however, was not in vain. At least two armoured divisions, which otherwise would have been turned against the British Expeditionary Force, had to be sent to overcome them. They have added another page to the glories of the Light Division, and the time gained enabled the Graveline waterlines to be flooded and to be held by the French troops.

Before finetune: Boulogne and Calais were the scenes of desperate fighting . They were withdrawn by orders from this country . The British Brigadier was given an hour to surrender . Only 30 unwounded survivors were brought off by the Navy .

After finetune: Boulogne and Calais were the scenes of desperate fighting in Dunkirk, with the Guards defending them for a while and the British Brigadier being given an hour to surrender.

Training procedure

Technical information

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
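To make the `lr_scheduler_type: linear` setting concrete, here is a rough sketch of how the learning rate would decay from 5e-05 to zero over training. It assumes no warmup steps, a single device, and no gradient accumulation (none of these are stated above), and the dataset size of 8,000 examples is a made-up number used only for illustration.

```python
import math


def total_training_steps(num_examples: int, batch_size: int = 8, num_epochs: int = 10) -> int:
    """Number of optimizer steps, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear schedule with optional warmup."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical dataset size; the real number of training examples is not given here.
steps = total_training_steps(num_examples=8000)
for s in (0, steps // 2, steps):
    print(s, linear_lr(s, steps))
```

Under these assumptions the rate is 5e-05 at the first step, half of that at the midpoint, and zero at the final step.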

Framework versions

  • Transformers 4.45.1
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Limitations

  • Language: English only.
  • Context Sensitivity: While the model performs well on general topics, it may struggle with highly specialized or technical content.
  • Bias: The model may reflect biases present in the training data, particularly those found in Wikipedia articles.
  • Length Limitations: The model performs best on long paragraphs that don't exceed 512 tokens. Very short paragraphs aren't suitable for summarization.
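The length limitation can be checked before calling the model. The sketch below separates paragraphs that fall inside a token range from those that do not. The 512-token upper bound comes from the list above; the lower bound of 50 tokens is an arbitrary placeholder, since "very short" is not quantified here, and the helper name `split_by_length` is hypothetical. The token counter is passed in as a function so the example runs without downloading a tokenizer.

```python
from typing import Callable, List, Tuple


def split_by_length(
    paragraphs: List[str],
    count_tokens: Callable[[str], int],
    min_tokens: int = 50,   # assumed threshold, not from the model card
    max_tokens: int = 512,  # upper bound stated in the limitations
) -> Tuple[List[str], List[str]]:
    """Return (suitable, skipped) paragraphs based on their token counts."""
    suitable, skipped = [], []
    for paragraph in paragraphs:
        n = count_tokens(paragraph)
        if min_tokens <= n <= max_tokens:
            suitable.append(paragraph)
        else:
            # Too short to be worth summarizing, or too long for the model.
            skipped.append(paragraph)
    return suitable, skipped


# Crude whitespace-based count, used here only as a stand-in for a real tokenizer.
def whitespace_count(text: str) -> int:
    return len(text.split())
```

In practice, a closer count would likely come from the model's own tokenizer, for example `lambda s: len(tokenizer(s)["input_ids"])` with `tokenizer = AutoTokenizer.from_pretrained("agentlans/text-summarization")`. Whitespace counts usually underestimate the number of subword tokens.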

Ethical Considerations

  • Bias and Fairness: Users should be aware of potential biases in the model's outputs, which may arise from the training data.
  • Misinformation: The model should not be used as the sole source of information, especially in critical applications, as it may inadvertently summarize misleading or inaccurate content.