---
license: other
base_model: microsoft/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: phi-sft-outB
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# phi-sft-outB

This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9402

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 4

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.9855        | 0.01  | 1    | 1.1349          |
| 1.3387        | 0.2   | 18   | 1.1270          |
| 1.1906        | 0.4   | 36   | 1.0901          |
| 0.8854        | 0.6   | 54   | 1.0535          |
| 1.1896        | 0.8   | 72   | 1.0300          |
| 0.9865        | 1.0   | 90   | 1.0094          |
| 1.1497        | 1.2   | 108  | 0.9901          |
| 1.1192        | 1.4   | 126  | 0.9769          |
| 0.8953        | 1.6   | 144  | 0.9651          |
| 1.0513        | 1.81  | 162  | 0.9565          |
| 0.9776        | 2.01  | 180  | 0.9512          |
| 1.087         | 2.21  | 198  | 0.9473          |
| 1.1714        | 2.41  | 216  | 0.9443          |
| 0.8238        | 2.61  | 234  | 0.9423          |
| 1.0734        | 2.81  | 252  | 0.9413          |
| 0.8108        | 3.01  | 270  | 0.9406          |
| 1.0202        | 3.21  | 288  | 0.9403          |
| 1.134         | 3.41  | 306  | 0.9402          |
| 0.8043        | 3.61  | 324  | 0.9401          |
| 1.0807        | 3.81  | 342  | 0.9402          |

### Framework versions

- Transformers 4.34.0.dev0
- Pytorch 2.0.0+cu118
- Datasets 2.14.5
- Tokenizers 0.14.0
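
## How to use

Usage details for this checkpoint have not been documented. Below is a minimal inference sketch using the standard `transformers` API. The repository id `your-org/phi-sft-outB` is a placeholder assumption (substitute the actual Hub path), and `trust_remote_code=True` is passed because phi-1_5 relied on custom modeling code at the Transformers version listed above (4.34.0.dev0).

```python
# Minimal inference sketch. Assumptions: the fine-tuned weights are published
# on the Hugging Face Hub; "your-org/phi-sft-outB" is a hypothetical repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/phi-sft-outB"  # placeholder: replace with the real repo path

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # phi-1_5 shipped custom modeling code pre-4.37
)

prompt = "Write a short function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```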