VijayChoudhari committed on
Commit
b6eccb1
1 Parent(s): 8bcfaa3

End of training

README.md CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.4924
+ - Loss: 0.5768
 
  ## Model description
 
@@ -34,29 +34,23 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 0.0001
- - train_batch_size: 8
+ - train_batch_size: 2
  - eval_batch_size: 2
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 32
+ - total_train_batch_size: 8
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 500
- - training_steps: 4000
+ - lr_scheduler_warmup_steps: 200
+ - training_steps: 1000
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-------:|:----:|:---------------:|
- | 0.4823 | 11.9048 | 500 | 0.4724 |
- | 0.4388 | 23.8095 | 1000 | 0.4662 |
- | 0.4204 | 35.7143 | 1500 | 0.4590 |
- | 0.3981 | 47.6190 | 2000 | 0.4720 |
- | 0.3897 | 59.5238 | 2500 | 0.4791 |
- | 0.3819 | 71.4286 | 3000 | 0.4767 |
- | 0.3719 | 83.3333 | 3500 | 0.4906 |
- | 0.3658 | 95.2381 | 4000 | 0.4924 |
+ | 0.3277 | 17.8571 | 500 | 0.5882 |
+ | 0.3036 | 35.7143 | 1000 | 0.5768 |
 
 
  ### Framework versions
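
The hyperparameters changed in the README diff are related by simple arithmetic, which the sketch below works through. It assumes a single training device (consistent with 2 × 4 = 8 and 8 × 4 = 32 in the diff) and the usual reading of a `linear` scheduler with warmup: the learning rate ramps from 0 to its peak over the warmup steps, then decays linearly to 0 at the final step. The function names `effective_batch_size` and `linear_lr` are illustrative, not part of any library.

```python
# Values taken from the "+" lines of the README diff above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 200
TRAINING_STEPS = 1000


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer step.

    num_devices=1 is an assumption; the README does not state the device count.
    """
    return per_device * grad_accum * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TRAINING_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("total_train_batch_size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 100, 200, 600, 1000):
        print(f"step {step:4d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the peak learning rate of 1e-4 is reached at step 200 and has halved again by step 600.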
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d9a7c2e4ba755f38e809205c0d9818c337b0be02ae549df10826e87a37af64ff
+ oid sha256:cf31fdf90da86446687620e1e8b5ab6df0e86ee6780a64d85280151b1e99bda3
  size 577789320
runs/Nov28_01-55-15_22ad9fb26287/events.out.tfevents.1732758948.22ad9fb26287.441.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c1fb7494149b08b56d73c63e8c1c9760c850fe64ef0dd38a94038275401e88ac
- size 33676
+ oid sha256:57430090f9b8ce63fe95ce7f1b8e4ec6fb972888ce5fa5feaae925481a736ee5
+ size 34942
runs/Nov28_02-48-28_22ad9fb26287/events.out.tfevents.1732762117.22ad9fb26287.441.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:692d3d7942fdd3007986f0847b250da9843bb69a58d053ea50d181ddab85fe50
+ size 6711
runs/Nov28_02-49-34_22ad9fb26287/events.out.tfevents.1732762181.22ad9fb26287.441.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f58e98fe10fbb5f5cbc2575305117e97bd6ca557847f15fd1beb6a7da271197
+ size 6711
runs/Nov28_02-50-02_22ad9fb26287/events.out.tfevents.1732762209.22ad9fb26287.441.3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f817508972583f205dbbe2a3028fc7aa08708d0f1852448c498c71a2a226e8e
+ size 16027
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:736f5dffbe983c6775f40acbea286c28d3f2784e24bd9dd9e598dac2fa8513ce
+ oid sha256:63fd53472d1091fe8fdafa7cf00eb3dadce619301d7cc88ab51023275e8f33b6
  size 5496
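
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object ID (hash algorithm and digest), and the size in bytes. As a rough sketch of how such a pointer could be read, the snippet below parses the new `training_args.bin` pointer from the diff. The `LfsPointer` type and `parse_lfs_pointer` helper are hypothetical names introduced for illustration, not part of any Git LFS tooling.

```python
from typing import NamedTuple


class LfsPointer(NamedTuple):
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str
    hash_algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a pointer file into an LfsPointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# New pointer for training_args.bin, copied from the "+" side of the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:63fd53472d1091fe8fdafa7cf00eb3dadce619301d7cc88ab51023275e8f33b6
size 5496
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer.hash_algo, pointer.digest[:12], pointer.size)
```

For `model.safetensors` and `training_args.bin` only the `oid` changes while `size` stays the same, so the contents were rewritten without changing the byte length.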