LoneStriker committed on
Commit • df3a89f • 1 Parent(s): 70aec39
Upload folder using huggingface_hub
Browse files
- README.md +32 -0
- output.safetensors +1 -1
README.md CHANGED
@@ -5,6 +5,34 @@ datasets:
 pipeline_tag: text-generation
 ---
 
+We offer a temporary HF space where everyone can try out the model: [**Snorkel-Mistral-PairRM-DPO Space**](https://huggingface.co/spaces/snorkelai/snorkelai_mistral_pairrm_dpo_text_inference)
+
+We also provide an inference endpoint for everyone to test the model.
+It may initially take a few minutes to activate, but it will eventually operate at the standard speed of HF's 7B-model text-generation endpoints.
+Inference speed depends on HF endpoint performance and is not related to Snorkel offerings.
+This endpoint is designed for initial trials, not for ongoing production use. Have fun!
+
+```python
+import requests
+
+API_URL = "https://t1q6ks6fusyg1qq7.us-east-1.aws.endpoints.huggingface.cloud"
+headers = {
+    "Accept": "application/json",
+    "Content-Type": "application/json"
+}
+
+def query(payload):
+    response = requests.post(API_URL, headers=headers, json=payload)
+    return response.json()
+
+output = query({
+    "inputs": "[INST] Recommend me some Hollywood movies [/INST]",
+    "parameters": {}
+})
+```
+
+
+
 ### Dataset:
 Training dataset: [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
 
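A usage note on the endpoint example in the hunk above: the README snippet sends a request but never reads the reply. The sketch below is illustrative only; the `mistral_prompt` helper is hypothetical, and the `[{"generated_text": ...}]` response shape and the `max_new_tokens`/`temperature` parameters follow the common HF text-generation-inference schema rather than anything this commit confirms.

```python
import requests

API_URL = "https://t1q6ks6fusyg1qq7.us-east-1.aws.endpoints.huggingface.cloud"
headers = {"Accept": "application/json", "Content-Type": "application/json"}

def mistral_prompt(user_message: str) -> str:
    # Hypothetical helper: applies the [INST] {prompt} [/INST] format
    # documented in the README (see the second hunk below).
    return f"[INST] {user_message} [/INST]"

response = requests.post(
    API_URL,
    headers=headers,
    json={
        "inputs": mistral_prompt("Recommend me some Hollywood movies"),
        # Assumed parameter names, following the common HF
        # text-generation-inference schema; not confirmed by this commit.
        "parameters": {"max_new_tokens": 256, "temperature": 0.7},
    },
)
result = response.json()
# Assumed response shape: [{"generated_text": ...}], as returned by
# standard HF text-generation endpoints.
print(result[0]["generated_text"])
```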
@@ -19,6 +47,10 @@ We utilize ONLY the prompts from [UltraFeedback](https://huggingface.co/datasets
 This overview provides a high-level summary of our approach.
 We plan to release more detailed results and findings in the coming weeks on the [Snorkel blog.](https://snorkel.ai/blog/)
 
+The prompt format follows the Mistral model:
+
+```[INST] {prompt} [/INST]```
+
 ### Training recipe:
 - The provided data is formatted to be compatible with Hugging Face's [Zephyr recipe](https://github.com/huggingface/alignment-handbook/tree/main/recipes/zephyr-7b-beta).
 We executed the n-th DPO iteration using the "train/test_iteration_{n}" splits.
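The training recipe above pulls per-iteration data. As a minimal sketch of loading one iteration with the `datasets` library, assuming split names of the form `train_iteration_{n}` and `test_iteration_{n}` (the README's "train/test_iteration_{n}" wording suggests this but does not spell it out):

```python
# Minimal sketch: load the data for one DPO iteration. The split names
# train_iteration_{n} / test_iteration_{n} are an assumption inferred
# from the README's "train/test_iteration_{n}" wording.
from datasets import load_dataset

n = 1  # DPO iteration number
train = load_dataset(
    "snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset",
    split=f"train_iteration_{n}",
)
test = load_dataset(
    "snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset",
    split=f"test_iteration_{n}",
)
print(train)
print(test)
```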
output.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7918d9a4ececb4f3a0e9f2703b19161746833a8df0ecf9051c40034a9ffcd3f1
 size 3861346868
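The LFS pointer above records the artifact's sha256 and byte size, which is enough to verify a download. A minimal sketch, assuming a local copy at `output.safetensors` (a hypothetical path):

```python
# Minimal sketch: check a downloaded output.safetensors against the
# oid and size recorded in the LFS pointer above.
import hashlib

EXPECTED_OID = "7918d9a4ececb4f3a0e9f2703b19161746833a8df0ecf9051c40034a9ffcd3f1"
EXPECTED_SIZE = 3861346868

path = "output.safetensors"  # hypothetical local path
digest = hashlib.sha256()
size = 0
with open(path, "rb") as f:
    # Hash in 1 MiB chunks to avoid loading the ~3.9 GB file into memory.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
        size += len(chunk)

assert size == EXPECTED_SIZE, f"size mismatch: {size}"
assert digest.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("output.safetensors matches the LFS pointer")
```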