	---
	license: llama2
	base_model: meta-llama/Llama-2-7b-hf
	tags:
	- generated_from_trainer
	model-index:
	- name: Llama-2-7b-spin-rephrased-10k
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Llama-2-7b-spin-rephrased-10k

	This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.1071
	- Rewards/real: 10.2171
	- Rewards/generated: -7.6243
	- Rewards/accuracies: 1.0
	- Rewards/margins: 17.8413
	- Logps/generated: -358.9117
	- Logps/real: -104.6875
	- Logits/generated: -0.8781
	- Logits/real: -1.4494
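
The reward metrics above can be cross-checked against each other. The sketch below assumes the convention used by DPO-style preference trainers (an assumption, since the card does not name the trainer): the margin is the reward of the real response minus the reward of the generated one, and accuracy is the fraction of pairs where the real response scores higher. The helper names `reward_margin` and `reward_accuracy` are hypothetical.

```python
from typing import Iterable, Tuple


def reward_margin(reward_real: float, reward_generated: float) -> float:
    """Margin between the real and the generated response rewards."""
    return reward_real - reward_generated


def reward_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (real, generated) pairs where the real reward is higher."""
    pairs = list(pairs)
    return sum(1 for real, generated in pairs if real > generated) / len(pairs)


# Final evaluation values reported in this card.
final_margin = reward_margin(10.2171, -7.6243)

# The card reports 17.8413; the recomputed value agrees up to rounding
# of the four-decimal inputs.
print(f"recomputed margin: {final_margin:.4f}")
```

Because the mean of per-example differences equals the difference of the means, the aggregated values should agree up to rounding if the assumed convention holds.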

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-07
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 32
	- total_eval_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 3
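
The total batch sizes listed above follow from the per-device values, and the learning rate follows a linear schedule with warmup. The sketch below recomputes the totals and mirrors the usual linear-warmup, linear-decay rule. The function name `linear_schedule_lr` and the `total_steps=1000` value are illustrative assumptions, since the card does not state the total number of optimizer steps.

```python
import math

# Values taken from the hyperparameter list.
train_batch_size = 4          # per device
eval_batch_size = 4           # per device
num_devices = 4
gradient_accumulation_steps = 2

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices  # no accumulation at eval time


def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 5e-7, warmup_ratio: float = 0.1) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(total_train_batch_size, total_eval_batch_size)

# Hypothetical run length, used only to illustrate the schedule shape.
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps=1000))
```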

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
	|:-------------:|:------:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
	| 0.1687 | 0.1984 | 62 | 0.1554 | 5.2053 | -5.2548 | 1.0 | 10.4601 | -335.2168 | -154.8048 | -0.7218 | -0.4019 |
	| 0.1204 | 0.3968 | 124 | 0.1153 | 9.3697 | -4.5235 | 1.0 | 13.8932 | -327.9041 | -113.1613 | -0.8262 | -1.1627 |
	| 0.1114 | 0.5952 | 186 | 0.1125 | 9.6740 | -5.3166 | 1.0 | 14.9906 | -335.8354 | -110.1185 | -0.8446 | -1.2393 |
	| 0.1094 | 0.7936 | 248 | 0.1110 | 9.8335 | -5.4853 | 1.0 | 15.3188 | -337.5219 | -108.5231 | -0.8538 | -1.2560 |
	| 0.1115 | 0.992 | 310 | 0.1100 | 9.9127 | -6.4827 | 1.0 | 16.3954 | -347.4966 | -107.7317 | -0.8658 | -1.3304 |
	| 0.1046 | 1.1904 | 372 | 0.1093 | 9.9819 | -6.6707 | 1.0 | 16.6526 | -349.3765 | -107.0395 | -0.8656 | -1.3633 |
	| 0.1067 | 1.3888 | 434 | 0.1089 | 10.0127 | -7.5740 | 1.0 | 17.5868 | -358.4094 | -106.7308 | -0.8814 | -1.3898 |
	| 0.1038 | 1.5872 | 496 | 0.1083 | 10.0730 | -7.0038 | 1.0 | 17.0768 | -352.7069 | -106.1281 | -0.8755 | -1.3615 |
	| 0.0996 | 1.7856 | 558 | 0.1079 | 10.1219 | -7.0176 | 1.0 | 17.1396 | -352.8456 | -105.6391 | -0.8467 | -1.3431 |
	| 0.1058 | 1.984 | 620 | 0.1077 | 10.1479 | -7.4808 | 1.0 | 17.6287 | -357.4770 | -105.3797 | -0.8821 | -1.4055 |
	| 0.0995 | 2.1824 | 682 | 0.1074 | 10.1669 | -7.1947 | 1.0 | 17.3617 | -354.6166 | -105.1890 | -0.8781 | -1.4102 |
	| 0.1017 | 2.3808 | 744 | 0.1073 | 10.1849 | -7.6243 | 1.0 | 17.8092 | -358.9117 | -105.0093 | -0.8806 | -1.4228 |
	| 0.1031 | 2.5792 | 806 | 0.1072 | 10.2106 | -7.6581 | 1.0 | 17.8687 | -359.2500 | -104.7519 | -0.8787 | -1.4391 |
	| 0.1025 | 2.7776 | 868 | 0.1071 | 10.2105 | -7.6804 | 1.0 | 17.8909 | -359.4730 | -104.7534 | -0.8824 | -1.4506 |
	| 0.1067 | 2.976 | 930 | 0.1071 | 10.2171 | -7.6243 | 1.0 | 17.8413 | -358.9117 | -104.6875 | -0.8781 | -1.4494 |
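
The Epoch and Step columns in the table above allow a rough estimate of the training set size, which the card itself does not state. The sketch below divides step by epoch for a few rows and multiplies by the total train batch size of 32. The result is an inference from rounded log values rather than a documented figure, and it ignores any padding or dropping of the last batch in distributed training.

```python
# (epoch, step) pairs copied from the first, fifth, and last rows of the table.
log = [(0.1984, 62), (0.992, 310), (2.976, 930)]

total_train_batch_size = 32  # from the hyperparameter list

# Optimizer steps per epoch implied by each logged row.
steps_per_epoch = [step / epoch for epoch, step in log]

# Approximate number of training examples seen in one epoch.
approx_examples = [s * total_train_batch_size for s in steps_per_epoch]

for (epoch, step), s, n in zip(log, steps_per_epoch, approx_examples):
    print(f"epoch={epoch} step={step} steps/epoch={s:.1f} examples~{n:.0f}")
```

Each row implies about 312.5 steps per epoch, or roughly 10,000 examples, which would be consistent with the `10k` suffix in the model name.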


	### Framework versions

	- Transformers 4.43.3
	- PyTorch 2.2.2+cu121
	- Datasets 2.20.0
	- Tokenizers 0.19.1
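
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is a minimal sketch: the helper name `compare_versions` is hypothetical, and it assumes the listed frameworks are installed under the distribution names `transformers`, `torch`, `datasets`, and `tokenizers`. The installed PyTorch version string may or may not carry the `+cu121` local suffix depending on how it was installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed pip distribution name.
CARD_VERSIONS = {
    "transformers": "4.43.3",
    "torch": "2.2.2+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (wanted, installed_or_None, exact_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in compare_versions(CARD_VERSIONS).items():
    status = "match" if ok else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```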