---
license: cc-by-sa-4.0
---
This model is the RLHF version of `HuggingFaceH4/mistral-7b-sft-beta`, trained without any external responses.
We apply the GSHF algorithm to the SFT baseline. The only external signals are (1) a reward model and (2) AI-generated prompts.
**We obtain a 35.95% win rate (34.79% LC win rate) on Alpaca Eval v2**, compared with only 4.63% for the base model.
On MT-Bench, the model scores about 7.5, versus 5.3 for the base model.
These results demonstrate the significant potential of iterative RLHF to elicit appropriate and well-structured responses from LLMs,
even without any external responses.
## Model Details
We perform 3 iterations of the GSHF algorithm on `HuggingFaceH4/mistral-7b-sft-beta`, with responses labeled by a reward model; the prompts are generated by ChatGPT using self-instruct-style prompt augmentation. A minimal sketch of one such iteration is given after the example prompts below.
We use 60K AI-generated prompts in the training process.
Example prompts are shown below:
```json
{"prompt": "Why is gold considered a good reserve asset for central banks?"}
{"prompt": "What are the top 5 yoga poses for stress relief?"}
{"prompt": "Craft a blog title about the health implications of eating avocados daily based on their caloric value."}
{"prompt": "Design a simple HTML chat interface that simulates a conversation between a user and a bot, displaying two messages from each."}
{"prompt": "List 10 names from different cultures that embody the meanings of peace, harmony, or compassion."}
```
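To make the training loop concrete, the sketch below shows one GSHF-style iteration as best-of-n (rejection) sampling against a reward model. This is a minimal illustration only, not the exact training pipeline used for this model: the reward model name, `n`, and the sampling hyperparameters are placeholders.
```python
# Minimal sketch of one GSHF-style iteration: sample several responses per prompt
# and keep the one preferred by the reward model. NOT the exact training pipeline
# for this model; the reward model and all hyperparameters below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

policy = pipeline(
    "text-generation",
    model="HuggingFaceH4/mistral-7b-sft-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Placeholder reward model that scores (question, answer) pairs.
rm_name = "OpenAssistant/reward-model-deberta-v3-large-v2"
rm_tokenizer = AutoTokenizer.from_pretrained(rm_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(rm_name)

def best_of_n(prompt, n=4):
    """Sample n candidate responses and return the highest-scoring one."""
    messages = [{"role": "user", "content": prompt}]
    text = policy.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    candidates = [
        out["generated_text"]
        for out in policy(
            text,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
            num_return_sequences=n,
            return_full_text=False,
        )
    ]
    scores = []
    for cand in candidates:
        inputs = rm_tokenizer(prompt, cand, return_tensors="pt", truncation=True)
        with torch.no_grad():
            scores.append(reward_model(**inputs).logits[0].item())
    best = max(range(n), key=lambda i: scores[i])
    return candidates[best], scores[best]

# The selected (prompt, best response) pairs would then be used to update the
# policy before the next iteration.
```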
## Uses
The usage and chat template format follow those of the SFT model `HuggingFaceH4/mistral-7b-sft-beta`.
```python
# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
import torch
from transformers import pipeline
pipe = pipeline("text-generation", model="sfairXC/FsfairX-Zephyr-Chat-v0.1", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
{"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
```
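The `<|system|>`, `<|user|>`, and `<|assistant|>` markers in the example output come from the Zephyr-style chat template inherited from `HuggingFaceH4/mistral-7b-sft-beta`.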
## Evaluation
The evaluation results on Alpaca Eval v2 are provided below:
| Model | Win Rate | LC Win Rate | Avg Length |
|-------------|----------|-------------|------------|
| Base | 4.63 | 8.01 | 916 |
| Iteration 1 | 13.26 | 20.81 | 1205 |
| Iteration 2 | 23.57 | 27.63 | 1623 |
| Iteration 3 | 35.95 | 34.79 | 2275 |
## Citation
If you find this model helpful, please cite the following papers.
```bibtex
@article{dong2023raft,
title={Raft: Reward ranked finetuning for generative foundation model alignment},
author={Dong, Hanze and Xiong, Wei and Goyal, Deepanshu and Pan, Rui and Diao, Shizhe and Zhang, Jipeng and Shum, Kashun and Zhang, Tong},
journal={arXiv preprint arXiv:2304.06767},
year={2023}
}
@misc{xiong2024iterative,
title={Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint},
author={Wei Xiong and Hanze Dong and Chenlu Ye and Ziqi Wang and Han Zhong and Heng Ji and Nan Jiang and Tong Zhang},
year={2024},
eprint={2312.11456},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```