lapp0 committed
Commit: aea2459
Parent: 7493cfc

End of training
README.md CHANGED
```diff
@@ -150,6 +150,7 @@ The following hyperparameters were used during training:
 - seed: `42`
 - optimizer: `Adam with betas=(0.9,0.999) and epsilon=1e-08`
 - lr_scheduler_type: `polynomial`
+- lr_scheduler_warmup_ratio: `0.1`
 - num_epochs: `1.0`
 - distillation_objective: `DistillationObjective(
     logits_loss_component=LossComponent(
@@ -163,7 +164,7 @@ The following hyperparameters were used during training:
     weight=0
   )
 )`
-- lr_scheduler: `<torch.optim.lr_scheduler.LambdaLR object at 0x7786a35fef50>`
+- lr_scheduler: `<torch.optim.lr_scheduler.LambdaLR object at 0x778665672650>`
 - student_model_name_or_path: `None`
 - student_config_name_or_path: `None`
 - student_model_config: `{'num_hidden_layers': 15}`
@@ -187,7 +188,7 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: `1`
 - weight_decay: `0.0`
 - max_grad_norm: `1.0`
-- warmup_ratio: `0.0`
+- warmup_ratio: `0.1`
 - warmup_steps: `0`
 - gradient_checkpointing: `True`
```
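The README diff above records a `polynomial` lr_scheduler_type with `warmup_ratio: 0.1`. A minimal sketch of that schedule in plain Python, assuming the formula matches Hugging Face's `get_polynomial_decay_schedule_with_warmup`; the values `lr=0.0002`, `lr_end=2e-05`, and `power=0.7` come from the run name in the logs, while `total_steps` is illustrative:

```python
# Sketch of a polynomial-decay-with-warmup LR schedule (assumption: formula
# mirrors Hugging Face's get_polynomial_decay_schedule_with_warmup).
# Defaults lr_init=2e-4, lr_end=2e-5, power=0.7, warmup_ratio=0.1 come from
# the run name in the logs; total_steps is an illustrative assumption.
def polynomial_warmup_lr(step, total_steps, lr_init=2e-4, lr_end=2e-5,
                         power=0.7, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup, then polynomial decay to lr_end."""
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        # Linear warmup from 0 to lr_init over the first warmup_ratio of steps.
        return lr_init * step / max(1, warmup_steps)
    # Polynomial decay from lr_init down to lr_end over the remaining steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return (lr_init - lr_end) * remaining ** power + lr_end
```

The peak learning rate is reached exactly at the end of warmup, and the final step returns `lr_end`, which is what distinguishes this run from the parent commit's `warmup_ratio: 0.0` configuration.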
logs/learning_rate=0.0002, lr_scheduler_kwargs=__power___0.7___lr_end___2e-05_, lr_scheduler_type=polynomial, per_device_train_batch_size=8, warmup_ratio=0.1/events.out.tfevents.1726983816.1c1a426a2fee CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fff703151a31e96231d287db94a7a51e1b65e9efa423d01eaf7fec1c0ef57ba1
-size 3301516
+oid sha256:5119150f80c74b99fe989d96f21849facb83c524f54e9dfe779272cf3c683c68
+size 3432096
```
logs/learning_rate=0.0002, lr_scheduler_kwargs=__power___0.7___lr_end___2e-05_, lr_scheduler_type=polynomial, per_device_train_batch_size=8, warmup_ratio=0.1/events.out.tfevents.1727015278.1c1a426a2fee ADDED
```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2535e906d8815196c7061bebbef993d7a2bb3aa7cdc5757627fd6b0092587ddd
+size 529
```
model.safetensors CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01d21b81253f6b55bb25753810d41ee9b31ad1d91d4ebba2f34e5fe96359ace1
+oid sha256:e7916a52bfee88fb1994cf3ebd4fa09fcdee6d685846d2f5540cb62504e44c8c
 size 325669528
```
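The binary files in this commit are stored as Git LFS pointers of exactly this `version`/`oid`/`size` form (per the git-lfs spec URL in the diffs). A minimal sketch of reading one; the `parse_lfs_pointer` helper is hypothetical, not part of git-lfs:

```python
# Hypothetical helper: parse a Git LFS pointer file of the form shown in the
# diffs above (a "version" URL line, an "oid <algo>:<digest>" line, and a
# "size <bytes>" line).
def parse_lfs_pointer(text):
    """Return version, oid algorithm, digest, and size from an LFS pointer."""
    # Each line is "key value"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "oid_algo": algo,
            "oid": digest, "size": int(fields["size"])}

# Example using the new model.safetensors pointer from this commit:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e7916a52bfee88fb1994cf3ebd4fa09fcdee6d685846d2f5540cb62504e44c8c
size 325669528
"""
info = parse_lfs_pointer(pointer)
```

Note that only the pointer changes in the repository; the actual 325 MB weights live in LFS storage, keyed by the sha256 digest.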