End of training

Files changed (4)

README.md +24 -14
model.safetensors +1 -1
runs/Apr06_05-13-22_06dfac5f27c8/events.out.tfevents.1712380424.06dfac5f27c8.533.4 +2 -2
runs/Apr06_05-13-22_06dfac5f27c8/events.out.tfevents.1712389426.06dfac5f27c8.533.5 +3 -0

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.0097
+- Loss: 2.7547
 ## Model description
@@ -34,27 +34,37 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 4
-- eval_batch_size: 4
+- learning_rate: 0.0001
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 8
-- total_train_batch_size: 32
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 10000
-- num_epochs: 5
+- lr_scheduler_warmup_steps: 25000
+- num_epochs: 15
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 4.5766        | 1.0   | 1500 | 3.9055          |
-| 3.6326        | 2.0   | 3000 | 3.3553          |
-| 3.3738        | 3.0   | 4500 | 3.1776          |
-| 3.2193        | 4.0   | 6000 | 3.0785          |
-| 3.1067        | 5.0   | 7500 | 3.0097          |
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 4.5766        | 1.0   | 1500  | 3.9055          |
+| 3.6326        | 2.0   | 3000  | 3.3553          |
+| 3.3738        | 3.0   | 4500  | 3.1776          |
+| 3.2193        | 4.0   | 6000  | 3.0785          |
+| 3.1067        | 5.0   | 7500  | 3.0097          |
+| 3.0286        | 6.0   | 9000  | 2.9714          |
+| 2.961         | 7.0   | 10500 | 2.9407          |
+| 2.8925        | 8.0   | 12000 | 2.9111          |
+| 2.8291        | 9.0   | 13500 | 2.8815          |
+| 2.7617        | 10.0  | 15000 | 2.8577          |
+| 2.7061        | 11.0  | 16500 | 2.8126          |
+| 2.6515        | 12.0  | 18000 | 2.7981          |
+| 2.6165        | 13.0  | 19500 | 2.7822          |
+| 2.5813        | 14.0  | 21000 | 2.7689          |
+| 2.5213        | 15.0  | 22500 | 2.7547          |
 ### Framework versions
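
The updated hyperparameters are linked by simple arithmetic: a batch of 8 accumulated over 8 steps gives the reported total_train_batch_size of 64. The sketch below reproduces that, assuming a single device (the card does not say how many were used). It also evaluates a generic linear warmup-then-decay schedule at the last logged step. The listed warmup (25000 steps) is longer than the 22500 steps in the results table, so if the 15 epochs ran as one continuous schedule of this assumed shape, the learning rate would still be ramping up when training ended. The function names are illustrative and are not taken from the training script.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Generic linear warmup followed by linear decay to zero.

    This is an assumed schedule shape for illustration, not code from the run.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Values copied from the updated model card.
    print(effective_batch_size(8, 8))  # 64, matching total_train_batch_size
    final_lr = linear_schedule_lr(
        step=22500, base_lr=1e-4, warmup_steps=25000, total_steps=22500
    )
    print(f"learning rate at step 22500: {final_lr:.2e}")
```

Under these assumptions the final step is 90% of the way through warmup, which may be worth checking against the learning-rate curve in the TensorBoard event files from this commit.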

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1dd3cbaf1e57fd0caf30c8019b1f50a7460e1480f03da1046991409e7019886c
+oid sha256:384a704b871198095696105bdc62e7a8e9e8599a8832d412223afeb70b4c2e00
 size 497774208
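
The model.safetensors and event-file entries in this commit are Git LFS pointer files rather than the binaries themselves: three `key value` lines giving the spec version, the object hash, and the size in bytes. A minimal sketch of reading one follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is a key, one space, then the value.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


if __name__ == "__main__":
    # Pointer contents copied from the new side of the model.safetensors diff.
    pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:384a704b871198095696105bdc62e7a8e9e8599a8832d412223afeb70b4c2e00
size 497774208
"""
    info = parse_lfs_pointer(pointer)
    print(info["hash_algorithm"], info["size_bytes"])
```

A new oid with an unchanged size is what one would expect if the weights changed in value but not in shape or dtype, although the pointer alone cannot confirm that.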

runs/Apr06_05-13-22_06dfac5f27c8/events.out.tfevents.1712380424.06dfac5f27c8.533.4 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:733c0403961fc5370fd508ff202205435dc91feec3133006f1c4a6d1dfc8784f
-size 8514
+oid sha256:824dc2ce47fb3cc4ab154eef9c39432ad518b9ac02c41e9fdcfda1eb5dec65e3
+size 9795

runs/Apr06_05-13-22_06dfac5f27c8/events.out.tfevents.1712389426.06dfac5f27c8.533.5 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:371e05d9ce8f2d0c1edde0e18b5b676382fe364a9713c10279e0471a455614cf
+size 364