metadata
license: mit
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
metrics:
- rouge
base_model: TheBloke/zephyr-7B-beta-GPTQ
model-index:
- name: zephyr-Me
  results: []
zephyr-Me
This model is a fine-tuned version of TheBloke/zephyr-7B-beta-GPTQ. The training dataset is not documented. At the final logged evaluation (step 80, epoch 2.96), it achieves the following results on the evaluation set:
- Loss: 1.0107
- Rouge1: 0.7127
- Rouge2: 0.4797
- Rougel: 0.6694
- Rougelsum: 0.6951
- Meteor: 0.7003
- F1 Score: 0.0010
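For readers unfamiliar with the ROUGE figures above, here is a minimal sketch of a ROUGE-1 style score: the F-measure of unigram overlap between a reference text and a generated text. This is an illustration only. The card does not say how its metrics were computed, and established implementations (for example the `rouge_score` package) apply their own tokenization and optional stemming, so the simplified function below (whose name `rouge1_f` is hypothetical) will not reproduce the reported numbers.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure on lowercased, whitespace-split tokens.

    Assumption: plain whitespace tokenization with no stemming, which is
    cruder than what real ROUGE implementations do.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it appears
    # in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f("the cat sat on the mat", "the cat lay on the mat")
    print(f"{score:.4f}")  # 5 of 6 unigrams match on both sides -> 0.8333
```

ROUGE-2 follows the same idea with bigrams instead of unigrams, and ROUGE-L scores the longest common subsequence rather than n-gram overlap.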
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3
- mixed_precision_training: Native AMP
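To make the `learning_rate` and `lr_scheduler_type: cosine` settings concrete, the sketch below computes a half-cosine decay from the listed peak rate of 0.0002. Two values are assumptions, not facts from this card: the total step count of 81 is estimated from the training results table (step 80 falls at epoch 2.96, so roughly 27 steps per epoch over 3 epochs), and warmup is taken to be zero because no warmup setting is listed. The function name `cosine_lr` is hypothetical.

```python
import math

BASE_LR = 2e-4     # learning_rate from the hyperparameter list
TOTAL_STEPS = 81   # assumption: ~27 optimizer steps per epoch x 3 epochs
WARMUP_STEPS = 0   # assumption: the card lists no warmup setting


def cosine_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 20, 40, 60, 80):
        print(f"step {step:2d}: lr = {cosine_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.0002, falls to about half of that near the midpoint of training, and approaches zero by the last step, which is consistent with the validation loss flattening out in the final epoch.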
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor | F1 Score |
---|---|---|---|---|---|---|---|---|---|
2.1273 | 0.15 | 4 | 2.0726 | 0.4931 | 0.1866 | 0.4166 | 0.4753 | 0.4834 | 0.0077 |
1.6907 | 0.3 | 8 | 1.5193 | 0.6035 | 0.3389 | 0.5483 | 0.5905 | 0.5816 | 0.0010 |
1.3096 | 0.44 | 12 | 1.3236 | 0.6571 | 0.4159 | 0.6080 | 0.6386 | 0.6473 | 0.0008 |
1.1588 | 0.59 | 16 | 1.2651 | 0.6652 | 0.4210 | 0.6174 | 0.6455 | 0.6528 | 0.0008 |
1.1038 | 0.74 | 20 | 1.1852 | 0.6772 | 0.4239 | 0.6274 | 0.6557 | 0.6570 | 0.0008 |
1.0362 | 0.89 | 24 | 1.1448 | 0.6750 | 0.4256 | 0.6278 | 0.6547 | 0.6613 | 0.0008 |
1.0733 | 1.04 | 28 | 1.1137 | 0.6864 | 0.4397 | 0.6379 | 0.6655 | 0.6743 | 0.0008 |
0.8783 | 1.19 | 32 | 1.1179 | 0.6914 | 0.4510 | 0.6430 | 0.6680 | 0.6813 | 0.0010 |
0.8761 | 1.33 | 36 | 1.1020 | 0.6984 | 0.4545 | 0.6497 | 0.6768 | 0.6865 | 0.0010 |
0.8774 | 1.48 | 40 | 1.0696 | 0.7033 | 0.4604 | 0.6549 | 0.6834 | 0.6908 | 0.0010 |
0.8621 | 1.63 | 44 | 1.0485 | 0.7030 | 0.4642 | 0.6568 | 0.6850 | 0.6915 | 0.0010 |
0.8143 | 1.78 | 48 | 1.0334 | 0.7064 | 0.4670 | 0.6601 | 0.6874 | 0.6929 | 0.0010 |
0.7483 | 1.93 | 52 | 1.0232 | 0.7060 | 0.4681 | 0.6606 | 0.6868 | 0.6940 | 0.0010 |
0.7647 | 2.07 | 56 | 1.0148 | 0.7058 | 0.4700 | 0.6623 | 0.6884 | 0.6886 | 0.0010 |
0.6659 | 2.22 | 60 | 1.0135 | 0.7088 | 0.4737 | 0.6655 | 0.6917 | 0.6952 | 0.0010 |
0.7135 | 2.37 | 64 | 1.0098 | 0.7132 | 0.4783 | 0.6699 | 0.6948 | 0.6989 | 0.0010 |
0.6685 | 2.52 | 68 | 1.0123 | 0.7116 | 0.4787 | 0.6687 | 0.6939 | 0.6995 | 0.0010 |
0.6538 | 2.67 | 72 | 1.0113 | 0.7145 | 0.4811 | 0.6705 | 0.6966 | 0.7030 | 0.0010 |
0.6648 | 2.81 | 76 | 1.0108 | 0.7132 | 0.4800 | 0.6694 | 0.6955 | 0.7011 | 0.0010 |
0.6278 | 2.96 | 80 | 1.0107 | 0.7127 | 0.4797 | 0.6694 | 0.6951 | 0.7003 | 0.0010 |
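The table above can be scanned programmatically to see which evaluation step was best on each metric. The sketch below copies the step, epoch, validation loss, and Rouge1 columns by hand (the remaining columns are omitted for brevity) and picks out the lowest validation loss and the highest Rouge1. The variable names are illustrative only.

```python
# (step, epoch, validation_loss, rouge1), copied from the training results table
ROWS = [
    (4, 0.15, 2.0726, 0.4931),
    (8, 0.30, 1.5193, 0.6035),
    (12, 0.44, 1.3236, 0.6571),
    (16, 0.59, 1.2651, 0.6652),
    (20, 0.74, 1.1852, 0.6772),
    (24, 0.89, 1.1448, 0.6750),
    (28, 1.04, 1.1137, 0.6864),
    (32, 1.19, 1.1179, 0.6914),
    (36, 1.33, 1.1020, 0.6984),
    (40, 1.48, 1.0696, 0.7033),
    (44, 1.63, 1.0485, 0.7030),
    (48, 1.78, 1.0334, 0.7064),
    (52, 1.93, 1.0232, 0.7060),
    (56, 2.07, 1.0148, 0.7058),
    (60, 2.22, 1.0135, 0.7088),
    (64, 2.37, 1.0098, 0.7132),
    (68, 2.52, 1.0123, 0.7116),
    (72, 2.67, 1.0113, 0.7145),
    (76, 2.81, 1.0108, 0.7132),
    (80, 2.96, 1.0107, 0.7127),
]

best_loss = min(ROWS, key=lambda row: row[2])    # lowest validation loss
best_rouge1 = max(ROWS, key=lambda row: row[3])  # highest Rouge1
final = ROWS[-1]                                 # last logged evaluation

if __name__ == "__main__":
    print(f"lowest validation loss: {best_loss[2]:.4f} at step {best_loss[0]}")
    print(f"highest Rouge1:         {best_rouge1[3]:.4f} at step {best_rouge1[0]}")
    print(f"final step {final[0]}: loss {final[2]:.4f}, Rouge1 {final[3]:.4f}")
```

The lowest validation loss (1.0098) occurs at step 64 and the highest Rouge1 (0.7145) at step 72, while the headline results come from step 80. The differences are small (under 0.001 in loss and under 0.002 in Rouge1), so the final checkpoint is effectively on the plateau.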
Framework versions
- PEFT 0.7.1
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0