---
base_model: stabilityai/StableBeluga-13B
tags:
- generated_from_trainer
model-index:
- name: PE-13b-lora
  results: []
---
# PE-13b-lora

This model is a fine-tuned version of [stabilityai/StableBeluga-13B](https://huggingface.co/stabilityai/StableBeluga-13B) on an unknown dataset.
It achieves the following results on the evaluation set (a sketch of how these reward metrics are typically computed follows the list):
- Loss: 0.5704
- Rewards/chosen: 0.1581
- Rewards/rejected: -0.1076
- Rewards/accuracies: 0.9472
- Rewards/margins: 0.2658
- Logps/rejected: -73.1769
- Logps/chosen: -90.4042
- Logits/rejected: -1.7758
- Logits/chosen: -2.0462
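
The metric names above match those logged by TRL's `DPOTrainer`, which suggests DPO-style preference tuning, though the card does not state the training objective. A minimal sketch of how such metrics are typically derived from policy and reference log-probabilities, assuming DPO (the `beta` value is a hypothetical placeholder, not taken from this card):

```python
# Sketch of DPO-style reward metrics; the objective is an assumption, not confirmed by this card.
import torch
import torch.nn.functional as F

def dpo_metrics(policy_chosen_logps: torch.Tensor,
                policy_rejected_logps: torch.Tensor,
                ref_chosen_logps: torch.Tensor,
                ref_rejected_logps: torch.Tensor,
                beta: float = 0.1):  # hypothetical placeholder value
    # Implicit rewards: beta-scaled log-ratios between policy and reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    margins = chosen_rewards - rejected_rewards
    loss = -F.logsigmoid(margins).mean()  # DPO loss
    return {
        "rewards/chosen": chosen_rewards.mean().item(),
        "rewards/rejected": rejected_rewards.mean().item(),
        "rewards/accuracies": (margins > 0).float().mean().item(),
        "rewards/margins": margins.mean().item(),
        "loss": loss.item(),
    }
```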
## Model description

More information needed
## Intended uses & limitations

More information needed
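
In the meantime, a minimal loading sketch, assuming this repository hosts a PEFT LoRA adapter for the stated base model (the adapter id and prompt format are assumptions; the prompt follows the Stable Beluga convention):

```python
# Minimal loading sketch; assumes a PEFT LoRA adapter, which this card does not confirm.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "stabilityai/StableBeluga-13B"
adapter_id = "PE-13b-lora"  # placeholder: replace with the actual adapter repo id or local path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

# Stable Beluga prompt convention (assumed; adjust if the adapter was trained differently).
prompt = "### User:\nSummarize LoRA fine-tuning in one sentence.\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```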
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 5e-07
- train_batch_size: 6
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 96
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
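
For reference, a sketch of how the settings above map onto `transformers.TrainingArguments` (model, data, and trainer wiring omitted; the 8-GPU layout comes from the launcher, e.g. `torchrun`, not from these arguments):

```python
# The hyperparameters above expressed as TrainingArguments (illustrative mapping only).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="PE-13b-lora",       # placeholder output path
    learning_rate=5e-7,
    per_device_train_batch_size=6,  # x 8 GPUs x 2 accumulation steps = 96 total
    per_device_eval_batch_size=4,   # x 8 GPUs = 32 total
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,                 # Adam settings listed above (transformers defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```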
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.693  | 0.07 | 100  | 0.6933 | -0.0008 | -0.0005 | 0.4889 | -0.0003 | -72.1053 | -91.9932 | -1.7861 | -2.0525 |
| 0.69   | 0.14 | 200  | 0.6901 | 0.0031  | -0.0015 | 0.5611 | 0.0046  | -72.1153 | -91.9544 | -1.7859 | -2.0524 |
| 0.6842 | 0.21 | 300  | 0.6832 | 0.0139  | -0.0056 | 0.6917 | 0.0195  | -72.1567 | -91.8467 | -1.7847 | -2.0513 |
| 0.672  | 0.27 | 400  | 0.6718 | 0.0281  | -0.0131 | 0.8250 | 0.0412  | -72.2312 | -91.7049 | -1.7836 | -2.0504 |
| 0.6563 | 0.34 | 500  | 0.6575 | 0.0498  | -0.0211 | 0.8861 | 0.0709  | -72.3116 | -91.4876 | -1.7821 | -2.0494 |
| 0.6437 | 0.41 | 600  | 0.6416 | 0.0705  | -0.0340 | 0.9111 | 0.1044  | -72.4401 | -91.2810 | -1.7807 | -2.0486 |
| 0.6261 | 0.48 | 700  | 0.6277 | 0.0885  | -0.0435 | 0.9250 | 0.1320  | -72.5355 | -91.1010 | -1.7796 | -2.0478 |
| 0.6117 | 0.55 | 800  | 0.6127 | 0.1097  | -0.0567 | 0.9222 | 0.1664  | -72.6675 | -90.8891 | -1.7786 | -2.0474 |
| 0.6002 | 0.62 | 900  | 0.6019 | 0.1226  | -0.0683 | 0.9278 | 0.1909  | -72.7836 | -90.7598 | -1.7777 | -2.0468 |
| 0.5912 | 0.68 | 1000 | 0.5912 | 0.1344  | -0.0805 | 0.9333 | 0.2148  | -72.9053 | -90.6422 | -1.7770 | -2.0466 |
| 0.5822 | 0.75 | 1100 | 0.5822 | 0.1441  | -0.0909 | 0.9472 | 0.2350  | -73.0092 | -90.5447 | -1.7763 | -2.0462 |
| 0.5789 | 0.82 | 1200 | 0.5759 | 0.1517  | -0.0992 | 0.9333 | 0.2509  | -73.0923 | -90.4690 | -1.7763 | -2.0465 |
| 0.5689 | 0.89 | 1300 | 0.5722 | 0.1555  | -0.1033 | 0.9500 | 0.2588  | -73.1332 | -90.4305 | -1.7762 | -2.0465 |
| 0.5694 | 0.96 | 1400 | 0.5702 | 0.1579  | -0.1066 | 0.9417 | 0.2644  | -73.1662 | -90.4070 | -1.7761 | -2.0465 |
### Framework versions

- Transformers 4.35.0
- Pytorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
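
A quick, illustrative check that a local environment matches these pins:

```python
# Compare installed versions against the pins listed above (illustrative check).
import datasets
import tokenizers
import torch
import transformers

expected = {
    "Transformers": (transformers.__version__, "4.35.0"),
    "PyTorch": (torch.__version__, "2.1.1+cu121"),
    "Datasets": (datasets.__version__, "2.14.6"),
    "Tokenizers": (tokenizers.__version__, "0.14.1"),
}
for name, (found, want) in expected.items():
    print(f"{name}: {found}" + ("" if found == want else f" (card lists {want})"))
```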