---
license: cc
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
datasets:
- cnn_dailymail
- Convosumm
widget:
- text: >
    Can we say that among the Pythagoreans the “body” of the concept was number?
    What do you mean by "concept body"? shell. What then is hidden behind this
    shell? Definition of a concept) what definition of a concept is ultimately
    hidden behind the body in the form of a number? All those that the
    Pythagoreans indicated. I want to say that numbers were their very concept.
    They thought in numbers as in concepts. Shape maybe?) you can say yes, but
    it will need to be developed on a mug. The definitions of thought are
    subject to numbers. On the one hand, numbers are pure abstraction, which
    gives initial freedom of thought for the derivation of abstract, embryonic
    definitions, but then for the derivation, description of reality, more
    specific concepts, the abstractness of numbers, on the contrary, limits,
    “leads into the darkness.” One is the object, “in itself”;
model-index:
- name: BART-CNN-Convosumm
  results:
  - task:
      name: Abstractive Dialogue Summarization
      type: abstractive-text-summarization
    dataset:
      name: Reddit arg-filtered part of Convosumm
      type: Convosumm
    metrics:
    - name: Validation ROUGE-1
      type: rouge-1
      value: 38.6252
    - name: Validation ROUGE-L
      type: rouge-l
      value: 23.902
    - name: Test ROUGE-1
      type: rouge-1
      value: 38.3642
    - name: Test ROUGE-L
      type: rouge-l
      value: 23.7782
language:
- en
pipeline_tag: summarization
---

# BART-CNN-Convosumm

## Model description

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the arg-filtered Reddit part of the [ConvoSumm](https://github.com/Yale-LILY/ConvoSumm) dataset. It was trained to power a [multilingual Telegram-bot summarizer](https://github.com/akaRemeris/XLConvosumm-bot).

## Intended uses & limitations

Expected input: an unstructured set of concatenated messages, without nickname-to-message indexing.

## Training and evaluation data

More information needed
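## How to use

A minimal inference sketch with the Hugging Face Transformers `pipeline`. The Hub id `remeris/BART-CNN-Convosumm` is an assumption for illustration — replace it with this repository's actual path. The example input follows the expected format: messages concatenated into a single string, with no nickname prefixes.

```python
from transformers import pipeline

# NOTE: the model id below is assumed for illustration --
# substitute this repository's actual Hub path.
summarizer = pipeline("summarization", model="remeris/BART-CNN-Convosumm")

# Expected input: messages concatenated into one string,
# without "nickname: message" indexing.
chat = (
    "Can we say that among the Pythagoreans the body of the concept was number? "
    "What do you mean by concept body? "
    "I want to say that numbers were their very concept. "
    "They thought in numbers as in concepts."
)

summary = summarizer(chat, max_length=120, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```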
## Training procedure

Training runs were logged with Weights & Biases: [results](https://wandb.ai/remeris/BART-CNN-Convosumm/runs/68syxthd).

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 20
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 1
- num_epochs: 7
- label_smoothing_factor: 0.1

These settings are restated as a `Seq2SeqTrainingArguments` sketch after the framework versions below.

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 6.207         | 1.0   | 10   | 4.2651          | 32.3341 | 7.812   | 20.0411 | 29.4849   | 77.38   |
| 4.0248        | 1.99  | 20   | 3.9903          | 36.0787 | 11.0447 | 21.3596 | 33.2903   | 130.58  |
| 3.5933        | 2.99  | 30   | 3.9020          | 34.2931 | 11.2036 | 20.7935 | 30.8361   | 140.02  |
| 3.3086        | 3.98  | 40   | 3.8712          | 38.4842 | 11.9947 | 23.4913 | 34.4347   | 85.78   |
| 3.112         | 4.98  | 50   | 3.8700          | 38.652  | 11.8315 | 23.5208 | 34.5998   | 76.2    |
| 2.9933        | 5.97  | 60   | 3.8809          | 38.66   | 12.3337 | 23.4394 | 35.1976   | 83.26   |
| 2.834         | 6.97  | 70   | 3.8797          | 38.6252 | 12.2556 | 23.902  | 34.6324   | 81.28   |

The model achieves the following results on the evaluation set (50 data points):
- Loss: 3.8797
- Rouge1: 38.6252
- Rouge2: 12.2556
- RougeL: 23.902
- RougeLsum: 34.6324
- Gen Len: 81.28

And the following results on the test set (250 data points):
- Loss: 3.8343
- Rouge1: 38.3642
- Rouge2: 12.2056
- RougeL: 23.7782
- RougeLsum: 34.3959
- Gen Len: 84.132

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0
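### Reproducing the training setup

For reference, a hedged sketch of the hyperparameters above expressed as Transformers `Seq2SeqTrainingArguments`. `output_dir` is a placeholder, `predict_with_generate` is an assumption (needed to compute ROUGE during evaluation), and data loading/preprocessing are omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameter list above; this is a sketch, not the
# exact training script used for this model.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-convosumm",   # placeholder path
    learning_rate=3e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=20,    # total train batch size: 1 * 20 = 20
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="polynomial",
    warmup_steps=1,
    num_train_epochs=7,
    label_smoothing_factor=0.1,
    predict_with_generate=True,        # assumption: enables ROUGE at eval time
)
```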