---
base_model: stabilityai/StableBeluga-13B
tags:
- generated_from_trainer
model-index:
- name: PE-13b-lora
  results: []
---

# PE-13b-lora

This model is a fine-tuned version of [stabilityai/StableBeluga-13B](https://huggingface.co/stabilityai/StableBeluga-13B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5704
- Rewards/chosen: 0.1581
- Rewards/rejected: -0.1076
- Rewards/accuracies: 0.9472
- Rewards/margins: 0.2658
- Logps/rejected: -73.1769
- Logps/chosen: -90.4042
- Logits/rejected: -1.7758
- Logits/chosen: -2.0462

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 6
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 96
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.693         | 0.07  | 100  | 0.6933          | -0.0008        | -0.0005          | 0.4889             | -0.0003         | -72.1053       | -91.9932     | -1.7861         | -2.0525       |
| 0.69          | 0.14  | 200  | 0.6901          | 0.0031         | -0.0015          | 0.5611             | 0.0046          | -72.1153       | -91.9544     | -1.7859         | -2.0524       |
| 0.6842        | 0.21  | 300  | 0.6832          | 0.0139         | -0.0056          | 0.6917             | 0.0195          | -72.1567       | -91.8467     | -1.7847         | -2.0513       |
| 0.672         | 0.27  | 400  | 0.6718          | 0.0281         | -0.0131          | 0.8250             | 0.0412          | -72.2312       | -91.7049     | -1.7836         | -2.0504       |
| 0.6563        | 0.34  | 500  | 0.6575          | 0.0498         | -0.0211          | 0.8861             | 0.0709          | -72.3116       | -91.4876     | -1.7821         | -2.0494       |
| 0.6437        | 0.41  | 600  | 0.6416          | 0.0705         | -0.0340          | 0.9111             | 0.1044          | -72.4401       | -91.2810     | -1.7807         | -2.0486       |
| 0.6261        | 0.48  | 700  | 0.6277          | 0.0885         | -0.0435          | 0.9250             | 0.1320          | -72.5355       | -91.1010     | -1.7796         | -2.0478       |
| 0.6117        | 0.55  | 800  | 0.6127          | 0.1097         | -0.0567          | 0.9222             | 0.1664          | -72.6675       | -90.8891     | -1.7786         | -2.0474       |
| 0.6002        | 0.62  | 900  | 0.6019          | 0.1226         | -0.0683          | 0.9278             | 0.1909          | -72.7836       | -90.7598     | -1.7777         | -2.0468       |
| 0.5912        | 0.68  | 1000 | 0.5912          | 0.1344         | -0.0805          | 0.9333             | 0.2148          | -72.9053       | -90.6422     | -1.7770         | -2.0466       |
| 0.5822        | 0.75  | 1100 | 0.5822          | 0.1441         | -0.0909          | 0.9472             | 0.2350          | -73.0092       | -90.5447     | -1.7763         | -2.0462       |
| 0.5789        | 0.82  | 1200 | 0.5759          | 0.1517         | -0.0992          | 0.9333             | 0.2509          | -73.0923       | -90.4690     | -1.7763         | -2.0465       |
| 0.5689        | 0.89  | 1300 | 0.5722          | 0.1555         | -0.1033          | 0.9500             | 0.2588          | -73.1332       | -90.4305     | -1.7762         | -2.0465       |
| 0.5694        | 0.96  | 1400 | 0.5702          | 0.1579         | -0.1066          | 0.9417             | 0.2644          | -73.1662       | -90.4070     | -1.7761         | -2.0465       |

### Framework versions

- Transformers 4.35.0
- Pytorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
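
The reward and log-probability columns in the results above follow TRL's `DPOTrainer` logging format, which suggests direct preference optimization of a LoRA adapter. The following is a hedged reconstruction of the listed hyperparameters, not the authors' actual script: the training data, DPO `beta`, and LoRA configuration are not recorded in this card, so the values used for them below are assumptions made purely for illustration.

```python
# Hedged reconstruction of the training setup; not the authors' actual script.
# Assumed: the dataset, the DPO beta, and the LoRA config (r, alpha, dropout).
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_id = "stabilityai/StableBeluga-13B"
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Tiny placeholder preference set; the real training data is undocumented.
train_dataset = Dataset.from_dict({
    "prompt": ["### User:\nSay hello.\n\n### Assistant:\n"],
    "chosen": ["Hello! How can I help you today?"],
    "rejected": ["no."],
})

# Hyperparameters as listed in the card; the per-device batch sizes multiply
# out to the reported totals across 8 GPUs (6 x 8 x 2 = 96 train, 4 x 8 = 32 eval).
training_args = TrainingArguments(
    output_dir="PE-13b-lora",
    learning_rate=5e-7,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
)

peft_config = LoraConfig(  # assumed values; the card does not record them
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

trainer = DPOTrainer(
    model,
    args=training_args,
    beta=0.1,                    # assumed; not recorded in the card
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; the real eval split is unknown
    tokenizer=tokenizer,
    peft_config=peft_config,     # reference model is handled implicitly via PEFT
)
trainer.train()
```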
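
Since the usage sections above are placeholders, here is a minimal inference sketch. It assumes the LoRA adapter weights are published under this repo id (written here as `PE-13b-lora`) and that the base model's `### User:` / `### Assistant:` prompt convention applies; neither is confirmed by this card.

```python
# Minimal inference sketch (assumptions: adapter repo id, prompt format).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "stabilityai/StableBeluga-13B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)

# Apply the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, "PE-13b-lora")  # replace with the full hub id
model.eval()

prompt = "### User:\nSummarize what a LoRA adapter is.\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```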