---
license: mit
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
metrics:
- rouge
base_model: TheBloke/zephyr-7B-beta-GPTQ
model-index:
- name: zephyr-Me
  results: []
---

# zephyr-Me

This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0107
- Rouge1: 0.7127
- Rouge2: 0.4797
- Rougel: 0.6694
- Rougelsum: 0.6951
- Meteor: 0.7003
- F1 Score: 0.0010

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor | F1 Score |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:------:|:--------:|
| 2.1273 | 0.15 | 4 | 2.0726 | 0.4931 | 0.1866 | 0.4166 | 0.4753 | 0.4834 | 0.0077 |
| 1.6907 | 0.3 | 8 | 1.5193 | 0.6035 | 0.3389 | 0.5483 | 0.5905 | 0.5816 | 0.0010 |
| 1.3096 | 0.44 | 12 | 1.3236 | 0.6571 | 0.4159 | 0.6080 | 0.6386 | 0.6473 | 0.0008 |
| 1.1588 | 0.59 | 16 | 1.2651 | 0.6652 | 0.4210 | 0.6174 | 0.6455 | 0.6528 | 0.0008 |
| 1.1038 | 0.74 | 20 | 1.1852 | 0.6772 | 0.4239 | 0.6274 | 0.6557 | 0.6570 | 0.0008 |
| 1.0362 | 0.89 | 24 | 1.1448 | 0.6750 | 0.4256 | 0.6278 | 0.6547 | 0.6613 | 0.0008 |
| 1.0733 | 1.04 | 28 | 1.1137 | 0.6864 | 0.4397 | 0.6379 | 0.6655 | 0.6743 | 0.0008 |
| 0.8783 | 1.19 | 32 | 1.1179 | 0.6914 | 0.4510 | 0.6430 | 0.6680 | 0.6813 | 0.0010 |
| 0.8761 | 1.33 | 36 | 1.1020 | 0.6984 | 0.4545 | 0.6497 | 0.6768 | 0.6865 | 0.0010 |
| 0.8774 | 1.48 | 40 | 1.0696 | 0.7033 | 0.4604 | 0.6549 | 0.6834 | 0.6908 | 0.0010 |
| 0.8621 | 1.63 | 44 | 1.0485 | 0.7030 | 0.4642 | 0.6568 | 0.6850 | 0.6915 | 0.0010 |
| 0.8143 | 1.78 | 48 | 1.0334 | 0.7064 | 0.4670 | 0.6601 | 0.6874 | 0.6929 | 0.0010 |
| 0.7483 | 1.93 | 52 | 1.0232 | 0.7060 | 0.4681 | 0.6606 | 0.6868 | 0.6940 | 0.0010 |
| 0.7647 | 2.07 | 56 | 1.0148 | 0.7058 | 0.4700 | 0.6623 | 0.6884 | 0.6886 | 0.0010 |
| 0.6659 | 2.22 | 60 | 1.0135 | 0.7088 | 0.4737 | 0.6655 | 0.6917 | 0.6952 | 0.0010 |
| 0.7135 | 2.37 | 64 | 1.0098 | 0.7132 | 0.4783 | 0.6699 | 0.6948 | 0.6989 | 0.0010 |
| 0.6685 | 2.52 | 68 | 1.0123 | 0.7116 | 0.4787 | 0.6687 | 0.6939 | 0.6995 | 0.0010 |
| 0.6538 | 2.67 | 72 | 1.0113 | 0.7145 | 0.4811 | 0.6705 | 0.6966 | 0.7030 | 0.0010 |
| 0.6648 | 2.81 | 76 | 1.0108 | 0.7132 | 0.4800 | 0.6694 | 0.6955 | 0.7011 | 0.0010 |
| 0.6278 | 2.96 | 80 | 1.0107 | 0.7127 | 0.4797 | 0.6694 | 0.6951 | 0.7003 | 0.0010 |

### Framework versions

- PEFT 0.7.1
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0
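
### Example training configuration

The original training script is not included in this card. As a rough illustration only, the hyperparameters listed under "Training hyperparameters" could be expressed in a `trl` `SFTTrainer` setup like the sketch below. The dataset files, LoRA configuration, sequence length, and text field name are all placeholders, since none of them are recorded here; the pinned library versions are listed above under "Framework versions".

```python
# Illustrative reconstruction of the training setup from the hyperparameters
# above. NOT the original script: the dataset files, LoRA settings, sequence
# length, and text field are placeholders. Loading the GPTQ base for training
# additionally requires the auto-gptq and optimum packages.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Placeholder data files; the actual dataset is not recorded in this card.
data = load_dataset(
    "json", data_files={"train": "train.jsonl", "validation": "eval.jsonl"}
)

# Assumed LoRA settings; the actual adapter config is not recorded here.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

args = TrainingArguments(
    output_dir="zephyr-Me",
    learning_rate=2e-4,              # learning_rate: 0.0002
    per_device_train_batch_size=16,  # train_batch_size: 16
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    lr_scheduler_type="cosine",      # lr_scheduler_type: cosine
    num_train_epochs=3,              # num_epochs: 3
    fp16=True,                       # mixed_precision_training: Native AMP
    evaluation_strategy="steps",
    eval_steps=4,                    # matches the eval cadence in the results table
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",       # placeholder field name
    max_seq_length=512,              # assumed; not recorded in this card
)
trainer.train()
```

The Adam betas and epsilon listed above are the `TrainingArguments` defaults, so they need no explicit setting.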
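
### Metric computation example

The ROUGE and METEOR figures above are of the kind produced by the 🤗 `evaluate` library. A minimal sketch of computing them over paired model outputs and references follows; the strings are stand-in examples, and how the reported F1 score was computed is not recorded in this card.

```python
# Minimal sketch of computing ROUGE and METEOR with the `evaluate` library.
# Requires: pip install evaluate rouge_score nltk
import evaluate

predictions = ["the cat sat on the mat"]  # model outputs (placeholders)
references = ["the cat lay on the mat"]   # gold references (placeholders)

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

# rouge_scores contains rouge1, rouge2, rougeL, and rougeLsum, matching the
# Rouge1/Rouge2/Rougel/Rougelsum columns reported above.
rouge_scores = rouge.compute(predictions=predictions, references=references)
meteor_score = meteor.compute(predictions=predictions, references=references)
print(rouge_scores, meteor_score["meteor"])
```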
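
### Example usage

Because this is a PEFT adapter rather than a full model, it is loaded on top of the GPTQ base with `peft`. The sketch below assumes the adapter lives at a placeholder repo id (`your-username/zephyr-Me`) or a local path; substitute the actual location. Loading the GPTQ base requires the `auto-gptq` and `optimum` packages.

```python
# Sketch of loading the LoRA adapter on top of the GPTQ base for inference.
# "your-username/zephyr-Me" is a placeholder adapter repo id or local path.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
adapter_id = "your-username/zephyr-Me"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

inputs = tokenizer("Hello, how are", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```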