---
license: mit
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
metrics:
- rouge
base_model: TheBloke/zephyr-7B-beta-GPTQ
model-index:
- name: zephyr-Me
results: []
---
# zephyr-Me
This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0107
- Rouge1: 0.7127
- Rouge2: 0.4797
- Rougel: 0.6694
- Rougelsum: 0.6951
- Meteor: 0.7003
- F1 Score: 0.0010
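This checkpoint is a PEFT (LoRA) adapter rather than a full model, so inference loads the GPTQ base model first and attaches the adapter on top. A minimal sketch, assuming a hypothetical adapter repo id and that `auto-gptq`/`optimum` are installed for the quantized base:

```python
# Minimal inference sketch. "your-username/zephyr-Me" is a hypothetical repo id;
# replace it with wherever this adapter is actually published.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
adapter_id = "your-username/zephyr-Me"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned LoRA weights

# Zephyr ships a chat template, so build the prompt through the tokenizer.
messages = [{"role": "user", "content": "Summarize: PEFT adapters keep fine-tuning cheap."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```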
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3
- mixed_precision_training: Native AMP
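The Adam betas and epsilon above are the `transformers` defaults, so the configuration maps almost one-to-one onto `TrainingArguments` plus a TRL `SFTTrainer`. A sketch under stated assumptions (the dataset, LoRA configuration, and sequence length are not recorded in this card):

```python
# Sketch of an SFT run matching the hyperparameters above. The dataset id, LoRA
# settings, dataset_text_field, and max_seq_length are assumptions.
from datasets import load_dataset
from transformers import TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

ds = load_dataset("your-username/your-dataset")  # hypothetical; the real dataset is unknown
train_ds, eval_ds = ds["train"], ds["test"]

args = TrainingArguments(
    output_dir="zephyr-Me",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    seed=42,
    fp16=True,  # "Native AMP" mixed precision
)

peft_config = LoraConfig(  # assumed LoRA settings; not recorded in the card
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

trainer = SFTTrainer(
    model=base,                 # the GPTQ base model, loaded as in the inference sketch
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=512,         # assumed
    tokenizer=tokenizer,
)
trainer.train()
```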
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor | F1 Score |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:------:|:--------:|
| 2.1273 | 0.15 | 4 | 2.0726 | 0.4931 | 0.1866 | 0.4166 | 0.4753 | 0.4834 | 0.0077 |
| 1.6907 | 0.3 | 8 | 1.5193 | 0.6035 | 0.3389 | 0.5483 | 0.5905 | 0.5816 | 0.0010 |
| 1.3096 | 0.44 | 12 | 1.3236 | 0.6571 | 0.4159 | 0.6080 | 0.6386 | 0.6473 | 0.0008 |
| 1.1588 | 0.59 | 16 | 1.2651 | 0.6652 | 0.4210 | 0.6174 | 0.6455 | 0.6528 | 0.0008 |
| 1.1038 | 0.74 | 20 | 1.1852 | 0.6772 | 0.4239 | 0.6274 | 0.6557 | 0.6570 | 0.0008 |
| 1.0362 | 0.89 | 24 | 1.1448 | 0.6750 | 0.4256 | 0.6278 | 0.6547 | 0.6613 | 0.0008 |
| 1.0733 | 1.04 | 28 | 1.1137 | 0.6864 | 0.4397 | 0.6379 | 0.6655 | 0.6743 | 0.0008 |
| 0.8783 | 1.19 | 32 | 1.1179 | 0.6914 | 0.4510 | 0.6430 | 0.6680 | 0.6813 | 0.0010 |
| 0.8761 | 1.33 | 36 | 1.1020 | 0.6984 | 0.4545 | 0.6497 | 0.6768 | 0.6865 | 0.0010 |
| 0.8774 | 1.48 | 40 | 1.0696 | 0.7033 | 0.4604 | 0.6549 | 0.6834 | 0.6908 | 0.0010 |
| 0.8621 | 1.63 | 44 | 1.0485 | 0.7030 | 0.4642 | 0.6568 | 0.6850 | 0.6915 | 0.0010 |
| 0.8143 | 1.78 | 48 | 1.0334 | 0.7064 | 0.4670 | 0.6601 | 0.6874 | 0.6929 | 0.0010 |
| 0.7483 | 1.93 | 52 | 1.0232 | 0.7060 | 0.4681 | 0.6606 | 0.6868 | 0.6940 | 0.0010 |
| 0.7647 | 2.07 | 56 | 1.0148 | 0.7058 | 0.4700 | 0.6623 | 0.6884 | 0.6886 | 0.0010 |
| 0.6659 | 2.22 | 60 | 1.0135 | 0.7088 | 0.4737 | 0.6655 | 0.6917 | 0.6952 | 0.0010 |
| 0.7135 | 2.37 | 64 | 1.0098 | 0.7132 | 0.4783 | 0.6699 | 0.6948 | 0.6989 | 0.0010 |
| 0.6685 | 2.52 | 68 | 1.0123 | 0.7116 | 0.4787 | 0.6687 | 0.6939 | 0.6995 | 0.0010 |
| 0.6538 | 2.67 | 72 | 1.0113 | 0.7145 | 0.4811 | 0.6705 | 0.6966 | 0.7030 | 0.0010 |
| 0.6648 | 2.81 | 76 | 1.0108 | 0.7132 | 0.4800 | 0.6694 | 0.6955 | 0.7011 | 0.0010 |
| 0.6278 | 2.96 | 80 | 1.0107 | 0.7127 | 0.4797 | 0.6694 | 0.6951 | 0.7003 | 0.0010 |
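The ROUGE and METEOR columns can in principle be recomputed with the `evaluate` library; a sketch with placeholder texts (the exact decoding, aggregation, and the F1 definition the trainer used are not recorded):

```python
# Sketch of recomputing the evaluation columns with `evaluate`.
import evaluate

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")

preds = ["the model's generated text ..."]  # placeholder predictions
refs = ["the reference text ..."]           # placeholder references

scores = rouge.compute(predictions=preds, references=refs)  # rouge1/rouge2/rougeL/rougeLsum
scores["meteor"] = meteor.compute(predictions=preds, references=refs)["meteor"]
print(scores)
```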
### Framework versions
- PEFT 0.7.1
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0