scott156/LongT5-Large-NSPCC

Text2Text Generation

Generated from Trainer


scott156 committed on Apr 2

Commit 0e9b72a • 1 Parent(s): c7fca06

End of training

Files changed (3)

README.md +72 -0
generation_config.json +9 -0
model.safetensors +1 -1

README.md ADDED

	@@ -0,0 +1,72 @@

+---
+license: apache-2.0
+base_model: google/long-t5-tglobal-large
+tags:
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: LongT5-Large-NSPCC
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# LongT5-Large-NSPCC
+This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.5481
+- Rouge1: 0.4597
+- Rouge2: 0.1665
+- Rougel: 0.2562
+- Rougelsum: 0.2557
+- Gen Len: 250.6383
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0003
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 4
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 6
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
+| 6.0521        | 1.0   | 188  | 2.8154          | 0.2268 | 0.0411 | 0.1627 | 0.1626    | 145.7447 |
+| 2.5796        | 2.0   | 377  | 1.9961          | 0.3798 | 0.1115 | 0.2103 | 0.2101    | 220.234  |
+| 2.0398        | 3.0   | 566  | 1.7703          | 0.4208 | 0.1319 | 0.2255 | 0.2258    | 299.6915 |
+| 1.7329        | 4.0   | 755  | 1.5996          | 0.4427 | 0.1488 | 0.2423 | 0.2424    | 255.2553 |
+| 1.5609        | 5.0   | 943  | 1.5510          | 0.4688 | 0.1726 | 0.2578 | 0.2576    | 289.2979 |
+| 1.4733        | 5.98  | 1128 | 1.5481          | 0.4597 | 0.1665 | 0.2562 | 0.2557    | 250.6383 |
+### Framework versions
+- Transformers 4.39.2
+- Pytorch 2.2.1+cu121
+- Datasets 2.18.0
+- Tokenizers 0.15.2
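
The hyperparameters listed in the README above imply an effective batch size and a learning-rate curve, both of which can be sketched with the standard library. This is a minimal sketch, not the Trainer's code: it assumes a linear warmup followed by a half-cosine decay to zero, and it assumes the warmup length is the ceiling of `warmup_ratio * total_steps`. The function names are hypothetical, and `TOTAL_STEPS` is taken from the last row of the training results table.

```python
import math

# Values copied from the README in this commit.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03
TOTAL_STEPS = 1128  # final step in the training results table


def effective_batch_size(per_device: int, accum_steps: int, devices: int = 1) -> int:
    """Examples consumed per optimizer update."""
    return per_device * accum_steps * devices


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Assumed rule: round the warmup length up to a whole step."""
    return math.ceil(total_steps * ratio)


def lr_at(step: int, peak_lr: float, total_steps: int, warmup: int) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero (assumed shape)."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("warmup steps:", warmup)
    # Steps at which the table reports evaluation results.
    for step in (188, 377, 566, 755, 943, 1128):
        print(step, lr_at(step, LEARNING_RATE, TOTAL_STEPS, warmup))
```

Under these assumptions the learning rate peaks after a few dozen steps and is close to zero by the final evaluation, which fits the small change in validation loss between the last two rows of the table.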

generation_config.json ADDED

	@@ -0,0 +1,9 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "max_new_tokens": 400,
+  "no_repeat_ngram_size": 5,
+  "num_beams": 3,
+  "pad_token_id": 0,
+  "transformers_version": "4.39.2"
+}
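
The `no_repeat_ngram_size` value of 5 in the config above asks the decoder never to emit the same 5-gram twice. The sketch below illustrates that idea in plain Python: given the tokens generated so far, it returns the tokens that would complete an n-gram already present. It is a simplified illustration, not the Transformers implementation, and `banned_next_tokens` is a hypothetical name.

```python
def banned_next_tokens(generated: list[int], ngram_size: int) -> set[int]:
    """Return tokens that would complete an n-gram already present in `generated`."""
    if ngram_size <= 0 or len(generated) + 1 < ngram_size:
        return set()
    # The last (n - 1) tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):]) if ngram_size > 1 else ()
    banned: set[int] = set()
    # Scan every complete n-gram seen so far and compare its first (n - 1) tokens.
    for i in range(len(generated) - ngram_size + 1):
        window = generated[i:i + ngram_size]
        if tuple(window[:-1]) == prefix:
            banned.add(window[-1])
    return banned


if __name__ == "__main__":
    history = [1, 2, 3, 4, 5, 9, 1, 2, 3, 4]
    # With ngram_size=5, emitting 5 next would repeat the 5-gram (1, 2, 3, 4, 5).
    print(banned_next_tokens(history, 5))
```

During beam search, a decoder applying this rule would set the scores of the banned tokens to negative infinity before picking the next token for each beam.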

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:56161b2fa76bd832c2ea0df03ffdbb7ce637cdf6accf8e627166c011273ec775
+oid sha256:fa2f3eb3c63f6f12345249ed45486bfa1ea0d26880dbcaa962c26ace69c85a07
 size 3132774536
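
The `model.safetensors` entry is a Git LFS pointer, so the diff only shows the SHA-256 digest changing while the size stays the same. A downloaded weights file can be checked against such a pointer with the standard library. This is a minimal sketch that assumes the pointer has the three `key value` lines shown above; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the reader downloaded.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields: dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: dict[str, str], chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a multi-gigabyte file is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:fa2f3eb3c63f6f12345249ed45486bfa1ea0d26880dbcaa962c26ace69c85a07
size 3132774536
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["oid"], pointer["size"])
```

Comparing the size first is a cheap way to reject a truncated download before hashing roughly 3 GB of data.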