Model save


Files changed (7)

README.md +113 -0
all_results.json +9 -0
generation_config.json +7 -0
model.safetensors +1 -1
runs/Jun07_09-08-56_poseidon/events.out.tfevents.1717751736.poseidon.3198658.0 +2 -2
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,113 @@

+---
+license: apache-2.0
+base_model: martimfasantos/tinyllama-1.1b-sum-sft-full
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: tinyllama-1.1b-sum-dpo-full_LR5e-7_3epochs
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tinyllama-1.1b-sum-dpo-full_LR5e-7_3epochs
+This model is a fine-tuned version of [martimfasantos/tinyllama-1.1b-sum-sft-full](https://huggingface.co/martimfasantos/tinyllama-1.1b-sum-sft-full) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.7095
+- Rewards/chosen: -2.8594
+- Rewards/rejected: -3.4153
+- Rewards/accuracies: 0.6336
+- Rewards/margins: 0.5559
+- Logps/rejected: -404.2798
+- Logps/chosen: -344.9560
+- Logits/rejected: -1.9830
+- Logits/chosen: -2.0075
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-07
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.689         | 0.0689 | 400   | 0.6921          | 0.0010         | -0.0011          | 0.5616             | 0.0021          | -62.8638       | -58.9160     | -2.9633         | -2.9669       |
+| 0.6822        | 0.1378 | 800   | 0.6861          | -0.0503        | -0.0663          | 0.5746             | 0.0160          | -69.3792       | -64.0464     | -2.9255         | -2.9291       |
+| 0.6737        | 0.2068 | 1200  | 0.6780          | -0.2790        | -0.3169          | 0.5762             | 0.0379          | -94.4367       | -86.9165     | -2.8527         | -2.8562       |
+| 0.6648        | 0.2757 | 1600  | 0.6677          | -0.4500        | -0.5183          | 0.6029             | 0.0683          | -114.5829      | -104.0142    | -2.7578         | -2.7612       |
+| 0.6678        | 0.3446 | 2000  | 0.6576          | -0.7094        | -0.8175          | 0.6217             | 0.1081          | -144.4979      | -129.9582    | -2.6611         | -2.6651       |
+| 0.6253        | 0.4135 | 2400  | 0.6468          | -1.0987        | -1.2558          | 0.6236             | 0.1571          | -188.3249      | -168.8844    | -2.4966         | -2.5038       |
+| 0.6616        | 0.4824 | 2800  | 0.6473          | -0.7839        | -0.9244          | 0.6303             | 0.1405          | -155.1877      | -137.4051    | -2.4668         | -2.4737       |
+| 0.6282        | 0.5513 | 3200  | 0.6395          | -1.3763        | -1.5943          | 0.6331             | 0.2181          | -222.1840      | -196.6437    | -2.2441         | -2.2573       |
+| 0.5886        | 0.6203 | 3600  | 0.6382          | -1.2763        | -1.4872          | 0.6355             | 0.2109          | -211.4734      | -186.6474    | -2.1487         | -2.1634       |
+| 0.5903        | 0.6892 | 4000  | 0.6398          | -1.0104        | -1.2131          | 0.6366             | 0.2027          | -184.0546      | -160.0534    | -2.1888         | -2.2035       |
+| 0.5886        | 0.7581 | 4400  | 0.6349          | -1.2844        | -1.5732          | 0.6341             | 0.2888          | -220.0676      | -187.4508    | -2.0898         | -2.1111       |
+| 0.5907        | 0.8270 | 4800  | 0.6306          | -1.3443        | -1.6135          | 0.6478             | 0.2692          | -224.0959      | -193.4449    | -2.0942         | -2.1137       |
+| 0.5456        | 0.8959 | 5200  | 0.6327          | -1.1753        | -1.4199          | 0.6408             | 0.2446          | -204.7423      | -176.5441    | -2.1214         | -2.1394       |
+| 0.5465        | 0.9649 | 5600  | 0.6325          | -1.2769        | -1.5500          | 0.6371             | 0.2731          | -217.7467      | -186.7071    | -2.0669         | -2.0872       |
+| 0.4632        | 1.0338 | 6000  | 0.6484          | -2.1822        | -2.6404          | 0.6496             | 0.4582          | -326.7876      | -277.2339    | -1.8836         | -1.9125       |
+| 0.4736        | 1.1027 | 6400  | 0.6454          | -2.1568        | -2.5961          | 0.6547             | 0.4393          | -322.3579      | -274.6943    | -1.8531         | -1.8794       |
+| 0.4665        | 1.1716 | 6800  | 0.6386          | -1.8958        | -2.2728          | 0.6443             | 0.3770          | -290.0295      | -248.5992    | -1.8821         | -1.9042       |
+| 0.4789        | 1.2405 | 7200  | 0.6483          | -1.9198        | -2.2931          | 0.6403             | 0.3733          | -292.0611      | -250.9941    | -1.9443         | -1.9659       |
+| 0.5477        | 1.3094 | 7600  | 0.6413          | -1.7843        | -2.1677          | 0.6499             | 0.3834          | -279.5165      | -237.4425    | -1.9622         | -1.9845       |
+| 0.4423        | 1.3784 | 8000  | 0.6528          | -2.0003        | -2.3620          | 0.6415             | 0.3617          | -298.9479      | -259.0417    | -1.9266         | -1.9469       |
+| 0.4668        | 1.4473 | 8400  | 0.6515          | -1.8405        | -2.1818          | 0.6403             | 0.3413          | -280.9325      | -243.0684    | -1.9825         | -2.0027       |
+| 0.509         | 1.5162 | 8800  | 0.6471          | -1.9547        | -2.3166          | 0.6424             | 0.3619          | -294.4091      | -254.4828    | -2.0224         | -2.0422       |
+| 0.4177        | 1.5851 | 9200  | 0.6542          | -1.9336        | -2.3034          | 0.6392             | 0.3699          | -293.0923      | -252.3707    | -1.9854         | -2.0064       |
+| 0.4181        | 1.6540 | 9600  | 0.6626          | -2.3352        | -2.8057          | 0.6438             | 0.4706          | -343.3230      | -292.5314    | -1.9265         | -1.9501       |
+| 0.4469        | 1.7229 | 10000 | 0.6436          | -1.8037        | -2.1726          | 0.6431             | 0.3689          | -280.0089      | -239.3807    | -2.0388         | -2.0591       |
+| 0.4365        | 1.7919 | 10400 | 0.6446          | -1.7691        | -2.1263          | 0.6466             | 0.3572          | -275.3837      | -235.9303    | -2.0443         | -2.0637       |
+| 0.4488        | 1.8608 | 10800 | 0.6558          | -2.1203        | -2.5393          | 0.6450             | 0.4190          | -316.6843      | -271.0489    | -2.0317         | -2.0535       |
+| 0.4611        | 1.9297 | 11200 | 0.6646          | -2.4708        | -2.9416          | 0.6468             | 0.4708          | -356.9083      | -306.0948    | -1.9987         | -2.0224       |
+| 0.4546        | 1.9986 | 11600 | 0.6541          | -2.2751        | -2.7321          | 0.6436             | 0.4570          | -335.9583      | -286.5284    | -1.9967         | -2.0195       |
+| 0.3836        | 2.0675 | 12000 | 0.6827          | -2.7558        | -3.3214          | 0.6464             | 0.5655          | -394.8881      | -334.6001    | -1.9585         | -1.9844       |
+| 0.337         | 2.1365 | 12400 | 0.7083          | -3.2136        | -3.8269          | 0.6424             | 0.6132          | -445.4347      | -380.3789    | -1.9217         | -1.9480       |
+| 0.3756        | 2.2054 | 12800 | 0.6892          | -2.5637        | -3.0760          | 0.6378             | 0.5123          | -370.3519      | -315.3893    | -1.9938         | -2.0171       |
+| 0.4071        | 2.2743 | 13200 | 0.6989          | -2.7240        | -3.2763          | 0.6345             | 0.5523          | -390.3795      | -331.4143    | -1.9810         | -2.0059       |
+| 0.4236        | 2.3432 | 13600 | 0.7127          | -2.9174        | -3.4982          | 0.6329             | 0.5808          | -412.5668      | -350.7576    | -1.9542         | -1.9798       |
+| 0.3527        | 2.4121 | 14000 | 0.7006          | -2.6980        | -3.2475          | 0.6252             | 0.5496          | -387.5038      | -328.8109    | -1.9852         | -2.0098       |
+| 0.3258        | 2.4810 | 14400 | 0.7095          | -2.9212        | -3.5009          | 0.6292             | 0.5798          | -412.8438      | -351.1316    | -1.9581         | -1.9835       |
+| 0.3646        | 2.5500 | 14800 | 0.7041          | -2.7281        | -3.2711          | 0.6350             | 0.5430          | -389.8630      | -331.8257    | -1.9884         | -2.0127       |
+| 0.3596        | 2.6189 | 15200 | 0.7046          | -2.7894        | -3.3372          | 0.6359             | 0.5478          | -396.4674      | -337.9509    | -1.9862         | -2.0104       |
+| 0.3549        | 2.6878 | 15600 | 0.7067          | -2.8436        | -3.3930          | 0.6310             | 0.5494          | -402.0518      | -343.3737    | -1.9841         | -2.0084       |
+| 0.2868        | 2.7567 | 16000 | 0.7117          | -2.9064        | -3.4673          | 0.6289             | 0.5609          | -409.4747      | -349.6523    | -1.9770         | -2.0016       |
+| 0.3243        | 2.8256 | 16400 | 0.7086          | -2.8350        | -3.3883          | 0.6320             | 0.5533          | -401.5786      | -342.5143    | -1.9841         | -2.0085       |
+| 0.3963        | 2.8946 | 16800 | 0.7104          | -2.8648        | -3.4205          | 0.6301             | 0.5558          | -404.8014      | -345.4919    | -1.9835         | -2.0081       |
+| 0.3399        | 2.9635 | 17200 | 0.7095          | -2.8594        | -3.4153          | 0.6336             | 0.5559          | -404.2798      | -344.9560    | -1.9830         | -2.0075       |
+### Framework versions
+- Transformers 4.41.2
+- Pytorch 2.1.2
+- Datasets 2.19.2
+- Tokenizers 0.19.1
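
Note on the reported metrics: the evaluation numbers in the card can be cross-checked by hand. The sketch below confirms that `Rewards/margins` is the difference between `Rewards/chosen` and `Rewards/rejected`, and that the first logged validation loss is close to the per-pair loss at the first logged margin. It assumes the run used the sigmoid DPO loss, `-log(sigmoid(margin))`; the card does not state the loss type, and `dpo_sigmoid_loss` is a hypothetical helper written for this check, not a TRL function.

```python
import math

# Final evaluation metrics, copied from the model card.
rewards_chosen = -2.8594
rewards_rejected = -3.4153
reported_margin = 0.5559

# The margin is the chosen reward minus the rejected reward.
margin = rewards_chosen - rewards_rejected
margin_matches = abs(margin - reported_margin) < 1e-4


def dpo_sigmoid_loss(reward_margin: float) -> float:
    """Per-pair loss -log(sigmoid(m)), assuming the sigmoid DPO variant.

    Hypothetical helper for this check; uses log(1 + exp(-m)), which is
    the same quantity written in a numerically convenient form.
    """
    return math.log1p(math.exp(-reward_margin))


# First evaluation row (step 400): margin 0.0021, validation loss 0.6921.
loss_at_first_margin = dpo_sigmoid_loss(0.0021)

# Final evaluation row (step 17200): margin 0.5559, validation loss 0.7095.
loss_at_final_margin = dpo_sigmoid_loss(reported_margin)

print(f"margin recomputed: {margin:.4f} (matches card: {margin_matches})")
print(f"loss at margin 0.0021: {loss_at_first_margin:.4f}")
print(f"loss at margin 0.5559: {loss_at_final_margin:.4f}")
```

Under that assumption, the loss evaluated at the final mean margin comes out well below the reported validation loss of 0.7095. That is expected for a convex per-pair loss: the mean of per-pair losses is at least the loss at the mean margin, so a wide spread of per-pair margins can raise the validation loss even while the average margin grows.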

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.49131362667692796,
+    "train_runtime": 84120.8122,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.312,
+    "train_steps_per_second": 0.207
+}
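
Note on the throughput figures: the values in this file are consistent with the hyperparameters listed in the model card. The sketch below recomputes them. It assumes a single training process (`num_processes = 1`), which is what `total_train_batch_size: 16` implies for a batch size of 8 and 2 accumulation steps; the card also reports `distributed_type: multi-GPU` without a device count, so treat that value as a guess. The steps-per-epoch formula is likewise an approximation, not the Trainer's exact bookkeeping.

```python
import math

# Values copied from all_results.json and the model card.
train_samples = 92858
num_epochs = 3
train_runtime_s = 84120.8122
train_batch_size = 8
gradient_accumulation_steps = 2

# Assumption: one process, inferred from total_train_batch_size == 16.
num_processes = 1

# Samples consumed per optimizer step.
effective_batch = train_batch_size * gradient_accumulation_steps * num_processes

# Approximate optimizer steps per epoch and over the whole run.
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_epochs

# Throughput, recomputed from the raw counts and the wall-clock runtime.
samples_per_second = train_samples * num_epochs / train_runtime_s
steps_per_second = total_steps / train_runtime_s

# Fraction of an epoch completed at the first logged step (400).
epoch_at_step_400 = 400 / steps_per_epoch

print(f"effective batch size: {effective_batch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"samples per second:   {samples_per_second:.3f}")
print(f"steps per second:     {steps_per_second:.3f}")
print(f"epoch at step 400:    {epoch_at_step_400:.4f}")
print(f"runtime in hours:     {train_runtime_s / 3600:.1f}")
```

With these inputs the recomputed rates round to the reported `train_samples_per_second` and `train_steps_per_second`, and the epoch fraction at step 400 matches the first row of the training results table.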

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "max_length": 2048,
+  "pad_token_id": 0,
+  "transformers_version": "4.41.2"
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8046c214462f7161775a967b35a779d57bf1ea749ed30388a5f39998ac29c839
+oid sha256:c508844a1cb0b812c37b5ac660b5f4024e1e75b6097be3d38b74d0b51a75d88e
 size 4400216536

runs/Jun07_09-08-56_poseidon/events.out.tfevents.1717751736.poseidon.3198658.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1e1bda09d7e6e95773921367571818652f715f183ad63d5015537efcc9269529
-size 1221875
+oid sha256:1292451c3394c6a04b948edf77b274f46528dd83fef14ea86de6c2200287a9c1
+size 1236935

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.49131362667692796,
+    "train_runtime": 84120.8122,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.312,
+    "train_steps_per_second": 0.207
+}

trainer_state.json ADDED

The diff for this file is too large to render.