Model save

Files changed (7)

README.md +113 -0
all_results.json +9 -0
generation_config.json +7 -0
model.safetensors +1 -1
runs/Jun27_22-38-11_poseidon/events.out.tfevents.1719528220.poseidon.708792.0 +2 -2
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,113 @@

+---
+license: apache-2.0
+base_model: martimfasantos/tinyllama-1.1b-sum-sft-full_old
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: tinyllama-1.1b-sum-dpo-full_LR5e-8_BS64_3epochs_old
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tinyllama-1.1b-sum-dpo-full_LR5e-8_BS64_3epochs_old
+This model is a fine-tuned version of [martimfasantos/tinyllama-1.1b-sum-sft-full_old](https://huggingface.co/martimfasantos/tinyllama-1.1b-sum-sft-full_old) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6852
+- Rewards/chosen: -0.0659
+- Rewards/rejected: -0.0838
+- Rewards/accuracies: 0.6006
+- Rewards/margins: 0.0179
+- Logps/rejected: -71.5612
+- Logps/chosen: -65.3069
+- Logits/rejected: -3.0327
+- Logits/chosen: -3.0385
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-08
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6931        | 0.0689 | 100  | 0.6932          | -0.0000        | 0.0001           | 0.4809             | -0.0001         | -63.1742       | -58.7157     | -3.1575         | -3.1631       |
+| 0.6931        | 0.1378 | 200  | 0.6932          | -0.0001        | -0.0000          | 0.4735             | -0.0001         | -63.1804       | -58.7190     | -3.1577         | -3.1633       |
+| 0.693         | 0.2068 | 300  | 0.6931          | 0.0002         | 0.0002           | 0.5044             | 0.0000          | -63.1651       | -58.6934     | -3.1573         | -3.1630       |
+| 0.6929        | 0.2757 | 400  | 0.6931          | 0.0004         | 0.0004           | 0.4928             | 0.0000          | -63.1405       | -58.6678     | -3.1565         | -3.1621       |
+| 0.6925        | 0.3446 | 500  | 0.6930          | 0.0009         | 0.0005           | 0.5374             | 0.0004          | -63.1296       | -58.6253     | -3.1548         | -3.1605       |
+| 0.6919        | 0.4135 | 600  | 0.6928          | 0.0012         | 0.0006           | 0.5644             | 0.0006          | -63.1213       | -58.5903     | -3.1529         | -3.1585       |
+| 0.6917        | 0.4824 | 700  | 0.6926          | 0.0017         | 0.0006           | 0.5562             | 0.0011          | -63.1193       | -58.5436     | -3.1505         | -3.1562       |
+| 0.6905        | 0.5513 | 800  | 0.6924          | 0.0019         | 0.0003           | 0.5681             | 0.0016          | -63.1495       | -58.5180     | -3.1471         | -3.1528       |
+| 0.6898        | 0.6203 | 900  | 0.6920          | 0.0018         | -0.0004          | 0.5839             | 0.0023          | -63.2244       | -58.5291     | -3.1427         | -3.1484       |
+| 0.6894        | 0.6892 | 1000 | 0.6918          | 0.0013         | -0.0015          | 0.5699             | 0.0028          | -63.3282       | -58.5803     | -3.1380         | -3.1437       |
+| 0.6894        | 0.7581 | 1100 | 0.6915          | 0.0004         | -0.0030          | 0.5718             | 0.0033          | -63.4761       | -58.6734     | -3.1327         | -3.1383       |
+| 0.6886        | 0.8270 | 1200 | 0.6912          | -0.0007        | -0.0048          | 0.5704             | 0.0041          | -63.6618       | -58.7859     | -3.1285         | -3.1342       |
+| 0.6878        | 0.8959 | 1300 | 0.6907          | -0.0026        | -0.0077          | 0.5802             | 0.0051          | -63.9501       | -58.9768     | -3.1220         | -3.1276       |
+| 0.6872        | 0.9649 | 1400 | 0.6904          | -0.0047        | -0.0104          | 0.5869             | 0.0057          | -64.2244       | -59.1855     | -3.1181         | -3.1238       |
+| 0.6865        | 1.0338 | 1500 | 0.6902          | -0.0077        | -0.0140          | 0.5869             | 0.0063          | -64.5792       | -59.4787     | -3.1117         | -3.1174       |
+| 0.6855        | 1.1027 | 1600 | 0.6898          | -0.0109        | -0.0180          | 0.5839             | 0.0071          | -64.9847       | -59.8052     | -3.1071         | -3.1128       |
+| 0.6842        | 1.1716 | 1700 | 0.6895          | -0.0156        | -0.0234          | 0.5827             | 0.0079          | -65.5234       | -60.2681     | -3.1002         | -3.1059       |
+| 0.6842        | 1.2405 | 1800 | 0.6890          | -0.0215        | -0.0304          | 0.5876             | 0.0089          | -66.2193       | -60.8594     | -3.0947         | -3.1005       |
+| 0.6804        | 1.3094 | 1900 | 0.6888          | -0.0253        | -0.0347          | 0.5911             | 0.0095          | -66.6540       | -61.2379     | -3.0896         | -3.0952       |
+| 0.6827        | 1.3784 | 2000 | 0.6883          | -0.0299        | -0.0405          | 0.5971             | 0.0107          | -67.2341       | -61.6997     | -3.0847         | -3.0904       |
+| 0.6805        | 1.4473 | 2100 | 0.6879          | -0.0345        | -0.0461          | 0.5980             | 0.0116          | -67.7896       | -62.1622     | -3.0798         | -3.0855       |
+| 0.68          | 1.5162 | 2200 | 0.6876          | -0.0374        | -0.0495          | 0.5929             | 0.0121          | -68.1323       | -62.4511     | -3.0751         | -3.0808       |
+| 0.6805        | 1.5851 | 2300 | 0.6873          | -0.0420        | -0.0550          | 0.5908             | 0.0130          | -68.6762       | -62.9119     | -3.0705         | -3.0763       |
+| 0.6802        | 1.6540 | 2400 | 0.6870          | -0.0440        | -0.0575          | 0.5936             | 0.0135          | -68.9288       | -63.1075     | -3.0657         | -3.0714       |
+| 0.6788        | 1.7229 | 2500 | 0.6868          | -0.0465        | -0.0604          | 0.5950             | 0.0140          | -69.2231       | -63.3570     | -3.0616         | -3.0674       |
+| 0.6784        | 1.7919 | 2600 | 0.6865          | -0.0493        | -0.0639          | 0.5948             | 0.0146          | -69.5742       | -63.6419     | -3.0568         | -3.0626       |
+| 0.6771        | 1.8608 | 2700 | 0.6863          | -0.0524        | -0.0676          | 0.5943             | 0.0152          | -69.9422       | -63.9527     | -3.0530         | -3.0588       |
+| 0.676         | 1.9297 | 2800 | 0.6861          | -0.0553        | -0.0710          | 0.5892             | 0.0157          | -70.2780       | -64.2370     | -3.0501         | -3.0558       |
+| 0.6793        | 1.9986 | 2900 | 0.6860          | -0.0571        | -0.0731          | 0.5922             | 0.0160          | -70.4908       | -64.4251     | -3.0474         | -3.0532       |
+| 0.6755        | 2.0675 | 3000 | 0.6858          | -0.0592        | -0.0755          | 0.5929             | 0.0163          | -70.7265       | -64.6294     | -3.0442         | -3.0500       |
+| 0.678         | 2.1365 | 3100 | 0.6856          | -0.0600        | -0.0768          | 0.5941             | 0.0168          | -70.8605       | -64.7164     | -3.0422         | -3.0480       |
+| 0.6795        | 2.2054 | 3200 | 0.6855          | -0.0611        | -0.0781          | 0.5941             | 0.0170          | -70.9855       | -64.8209     | -3.0400         | -3.0457       |
+| 0.6784        | 2.2743 | 3300 | 0.6854          | -0.0619        | -0.0791          | 0.5969             | 0.0172          | -71.0930       | -64.9018     | -3.0382         | -3.0440       |
+| 0.6792        | 2.3432 | 3400 | 0.6853          | -0.0627        | -0.0801          | 0.5946             | 0.0175          | -71.1919       | -64.9777     | -3.0366         | -3.0423       |
+| 0.6769        | 2.4121 | 3500 | 0.6853          | -0.0636        | -0.0811          | 0.5953             | 0.0175          | -71.2883       | -65.0695     | -3.0356         | -3.0414       |
+| 0.6771        | 2.4810 | 3600 | 0.6852          | -0.0645        | -0.0822          | 0.5978             | 0.0177          | -71.3953       | -65.1583     | -3.0346         | -3.0404       |
+| 0.6785        | 2.5500 | 3700 | 0.6851          | -0.0650        | -0.0829          | 0.5997             | 0.0179          | -71.4696       | -65.2152     | -3.0340         | -3.0397       |
+| 0.6779        | 2.6189 | 3800 | 0.6851          | -0.0655        | -0.0833          | 0.5962             | 0.0179          | -71.5138       | -65.2594     | -3.0332         | -3.0390       |
+| 0.6775        | 2.6878 | 3900 | 0.6851          | -0.0657        | -0.0836          | 0.5974             | 0.0179          | -71.5451       | -65.2842     | -3.0331         | -3.0389       |
+| 0.6757        | 2.7567 | 4000 | 0.6851          | -0.0658        | -0.0837          | 0.5985             | 0.0179          | -71.5477       | -65.2925     | -3.0326         | -3.0384       |
+| 0.6759        | 2.8256 | 4100 | 0.6850          | -0.0658        | -0.0839          | 0.6022             | 0.0181          | -71.5705       | -65.2951     | -3.0324         | -3.0382       |
+| 0.6755        | 2.8946 | 4200 | 0.6852          | -0.0659        | -0.0838          | 0.5990             | 0.0178          | -71.5600       | -65.3068     | -3.0326         | -3.0384       |
+| 0.6803        | 2.9635 | 4300 | 0.6852          | -0.0659        | -0.0838          | 0.6006             | 0.0179          | -71.5612       | -65.3069     | -3.0327         | -3.0385       |
+### Framework versions
+- Transformers 4.41.2
+- Pytorch 2.1.2
+- Datasets 2.20.0
+- Tokenizers 0.19.1
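
Review note: the final evaluation row of the model card can be sanity-checked by hand. Under the sigmoid DPO objective, the per-pair loss is `-log(sigmoid(reward_chosen - reward_rejected))`, so a zero margin gives ln 2 ≈ 0.6931, which matches the first rows of the training table. The sketch below assumes the default sigmoid loss without label smoothing (the card does not state the loss type); the variable and function names are illustrative only.

```python
import math

# Final evaluation metrics copied from the model card.
rewards_chosen = -0.0659
rewards_rejected = -0.0838
reported_margin = 0.0179
reported_loss = 0.6852


def dpo_loss_from_margin(margin: float) -> float:
    """Sigmoid DPO loss for one preference pair: -log(sigmoid(margin))."""
    return math.log1p(math.exp(-margin))


# The margin is simply the difference between the two mean rewards.
margin = rewards_chosen - rewards_rejected

# Loss of a policy identical to the reference model (margin 0) is ln 2.
loss_at_zero = dpo_loss_from_margin(0.0)

# Loss evaluated at the mean margin; the mean of per-pair losses
# cannot be smaller than this because the loss is convex in the margin.
loss_at_mean_margin = dpo_loss_from_margin(margin)

print(f"margin              = {margin:.4f}")
print(f"loss at zero margin = {loss_at_zero:.4f}")
print(f"loss at mean margin = {loss_at_mean_margin:.4f}")
print(f"reported eval loss  = {reported_loss:.4f}")
```

The recomputed margin agrees with the reported `Rewards/margins`, and the loss at the mean margin (about 0.6842) sits just below the reported 0.6852, as Jensen's inequality requires for a convex loss. With a learning rate of 5e-08, the small drop from ln 2 over three epochs is what these numbers suggest.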

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.6826107928102766,
+    "train_runtime": 83562.7786,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.334,
+    "train_steps_per_second": 0.052
+}
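
Review note: the throughput fields in this file can be cross-checked against the hyperparameters in the model card. The sketch below recomputes them from `train_samples`, `train_runtime`, the 3 epochs, and the total batch size of 64. Rounding steps per epoch up with `ceil` is an assumption about how the last partial batch is handled; the exact step count lives in `trainer_state.json`, which is not rendered in this diff.

```python
import math

# Values copied from all_results.json and the model card hyperparameters.
train_samples = 92858
num_epochs = 3
train_runtime_s = 83562.7786
total_train_batch_size = 64

# Samples processed per second across all epochs.
samples_per_second = train_samples * num_epochs / train_runtime_s

# Optimizer steps per epoch (assumption: the partial final batch counts as a step).
steps_per_epoch = math.ceil(train_samples / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs
steps_per_second = total_steps / train_runtime_s

# Fraction of an epoch completed at the first logged evaluation (step 100).
epoch_at_step_100 = 100 / steps_per_epoch

runtime_hours = train_runtime_s / 3600

print(f"samples/s       = {samples_per_second:.3f}")
print(f"steps per epoch = {steps_per_epoch}")
print(f"steps/s         = {steps_per_second:.3f}")
print(f"epoch @ step100 = {epoch_at_step_100:.4f}")
print(f"runtime (hours) = {runtime_hours:.1f}")
```

Under these assumptions the recomputed rates round to the reported `train_samples_per_second` (3.334) and `train_steps_per_second` (0.052), and the epoch fraction at step 100 matches the 0.0689 in the first row of the training table.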

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "max_length": 2048,
+  "pad_token_id": 0,
+  "transformers_version": "4.41.2"
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6fcaf3716b24ca2489f395b78c96e2b38008f53d4542f890cde2f43feca86ea5
+oid sha256:d4516be081c4c17228e26e834ecf82926bf6be43bb0db0e856112ed56eec90e4
 size 4400216536
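
Review note: `model.safetensors` is stored as a Git LFS pointer, a small text file with `version`, `oid`, and `size` lines, so only the hash changes here while the size stays the same. The sketch below parses the new pointer with a hypothetical helper, `parse_lfs_pointer` (not part of any library), to read the size in a friendlier unit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<hash-algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer contents for model.safetensors, copied from the diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d4516be081c4c17228e26e834ecf82926bf6be43bb0db0e856112ed56eec90e4
size 4400216536
"""

pointer = parse_lfs_pointer(pointer_text)
size_gb = pointer["size"] / 1e9

print(pointer["hash_algo"], pointer["digest"][:12], f"{size_gb:.2f} GB")
```

A size of about 4.4 GB is consistent with roughly 1.1 billion parameters stored at 4 bytes each, although the pointer itself does not record the tensor dtype.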

runs/Jun27_22-38-11_poseidon/events.out.tfevents.1719528220.poseidon.708792.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:59b25651dc68d54d3b843b09580a2f61297c133a927738d6f929ec1003f964f8
-size 333358
+oid sha256:baa57cefa3a49b01c7797935de57030b57ec1f62a5fbb147ac35de918f5eafd9
+size 337152

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.6826107928102766,
+    "train_runtime": 83562.7786,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.334,
+    "train_steps_per_second": 0.052
+}

trainer_state.json ADDED

The diff for this file is too large to render.