BART-CNN-Convosumm
Model description
This model is a fine-tuned version of facebook/bart-large-cnn, trained on the arg-filtered Reddit part of the Convosumm dataset. It was trained for use in a multilingual Telegram-bot summarizer.
Intended uses & limitations
Expected input: an unstructured set of concatenated messages, without nickname-message indexing.
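A minimal sketch of preparing such input and calling the model through the `transformers` summarization pipeline. The helper names `build_input` and `summarize`, the single-space separator, and the `max_length` value are illustrative assumptions, not part of the original training or bot code.

```python
from typing import Iterable

# Repository id as it appears on the model page.
MODEL_ID = "Remeris/BART-CNN-Convosumm"


def build_input(messages: Iterable[str]) -> str:
    """Concatenate raw message texts into one string.

    No nicknames or message indices are added, matching the expected input.
    Joining with a single space is an assumption; empty messages are dropped.
    """
    return " ".join(m.strip() for m in messages if m and m.strip())


def summarize(messages: Iterable[str], max_length: int = 128) -> str:
    """Summarize a chat (hypothetical helper); downloads the model on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(
        build_input(messages),
        max_length=max_length,  # assumed generation cap, tune as needed
        truncation=True,        # cut inputs that exceed the model's context
    )
    return result[0]["summary_text"]


# Message texts only: no "nickname: message" prefixes.
chat = [
    "I think the new update broke notifications.",
    "Same here, rolling back fixed it for me.",
    "Has anyone filed a bug report yet?",
]
text = build_input(chat)
```

Calling `summarize(chat)` would then return the generated summary as a string; it needs `transformers` and a PyTorch backend installed.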
Training and evaluation data
More information needed
Training procedure
Results were logged to Weights & Biases (Wandb).
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1
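The list above maps fairly directly onto `Seq2SeqTrainingArguments` keyword arguments. The sketch below shows one plausible mapping and checks the effective batch size. The output directory, `predict_with_generate=True`, and the single-device setup are assumptions, not taken from the original training script.

```python
# Values copied from the hyperparameter list; the key names are the
# corresponding transformers TrainingArguments fields.
TRAINING_KWARGS = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 20,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "polynomial",
    "warmup_steps": 1,
    "num_train_epochs": 7,
    "label_smoothing_factor": 0.1,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir: str = "bart-cnn-convosumm"):
    """Build the arguments object; output_dir is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments  # lazy: heavy dependency

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumed, since ROUGE and Gen Len are reported
        **TRAINING_KWARGS,
    )


print(effective_batch_size(TRAINING_KWARGS))  # 1 * 20 * 1 = 20
```

With a per-device batch of 1 and 20 accumulation steps, each optimizer step sees 20 examples, which matches the reported total_train_batch_size.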
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
6.207 | 1.0 | 10 | 4.2651 | 32.3341 | 7.812 | 20.0411 | 29.4849 | 77.38 |
4.0248 | 1.99 | 20 | 3.9903 | 36.0787 | 11.0447 | 21.3596 | 33.2903 | 130.58 |
3.5933 | 2.99 | 30 | 3.9020 | 34.2931 | 11.2036 | 20.7935 | 30.8361 | 140.02 |
3.3086 | 3.98 | 40 | 3.8712 | 38.4842 | 11.9947 | 23.4913 | 34.4347 | 85.78 |
3.112 | 4.98 | 50 | 3.8700 | 38.652 | 11.8315 | 23.5208 | 34.5998 | 76.2 |
2.9933 | 5.97 | 60 | 3.8809 | 38.66 | 12.3337 | 23.4394 | 35.1976 | 83.26 |
2.834 | 6.97 | 70 | 3.8797 | 38.6252 | 12.2556 | 23.902 | 34.6324 | 81.28 |
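As a quick reading aid for the table, the sketch below picks the checkpoint with the lowest validation loss and the one with the highest Rouge1. The numbers are copied from the table; the tuple layout and variable names are illustrative.

```python
# (epoch, step, validation_loss, rouge1) copied from the results table.
rows = [
    (1.0, 10, 4.2651, 32.3341),
    (1.99, 20, 3.9903, 36.0787),
    (2.99, 30, 3.9020, 34.2931),
    (3.98, 40, 3.8712, 38.4842),
    (4.98, 50, 3.8700, 38.652),
    (5.97, 60, 3.8809, 38.66),
    (6.97, 70, 3.8797, 38.6252),
]

best_by_loss = min(rows, key=lambda r: r[2])    # lowest validation loss
best_by_rouge1 = max(rows, key=lambda r: r[3])  # highest Rouge1

print("lowest validation loss at step", best_by_loss[1])  # step 50
print("highest Rouge1 at step", best_by_rouge1[1])        # step 60
```

Validation loss and ROUGE flatten out from roughly epoch 4 onward, so the later checkpoints differ only marginally on this 50-example evaluation set.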
The model achieves the following results on the evaluation set (50 data points):
- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- Rougel: 23.902
- Rougelsum: 34.6324
- Gen Len: 81.28
It achieves the following results on the test set (250 data points):
- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- Rougel: 23.7782
- Rougelsum: 34.3959
- Gen Len: 84.132
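For intuition about what the Rouge1 and Rouge2 figures measure, here is a simplified n-gram overlap F1 on a 0-100 scale. It is a sketch only: it uses lowercased whitespace tokenization and no stemming, so it is not the implementation that produced the numbers above and will not reproduce them exactly. The function name `rouge_n_f1` is hypothetical.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 (0-100) from clipped n-gram overlap."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()  # naive tokenization (assumption)
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # multiset intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n_f1(reference, candidate, n=1), 2))  # 5 of 6 unigrams match
print(round(rouge_n_f1(reference, candidate, n=2), 2))  # 3 of 5 bigrams match
```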
Framework versions
- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0
Model tree for Remeris/BART-CNN-Convosumm
Base model: facebook/bart-large-cnn
Evaluation results (self-reported)
- Validation ROUGE-1 on the Reddit arg-filtered part of Convosumm: 38.625
- Validation ROUGE-L on the Reddit arg-filtered part of Convosumm: 23.902
- Test ROUGE-1 on the Reddit arg-filtered part of Convosumm: 38.364
- Test ROUGE-L on the Reddit arg-filtered part of Convosumm: 23.778