dfurman committed · Commit 42cd355 (1 parent: 1bd57a0)

Update README.md

README.md CHANGED (+100 -1)
inference: false
model_creator: dfurman
quantized_by: dfurman
---

# dfurman/CalmeRys-78B-Orpo-v0.1

## 🤖 Model

This model is a finetune of `MaziyarPanahi/calme-2.4-rys-78b` on 1.5k rows of the `mlabonne/orpo-dpo-mix-40k` dataset.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/NG5WGL0ljzLsNhSBRVqnD.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/Zhk5Bpr1I2NrzX98Bhtp8.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/WgnKQnYIFWkCRSW3JPVAb.png)

You can find the experiment on W&B at [this address](https://wandb.ai/dryanfurman/huggingface/runs/1w50nu70?nw=nwuserdryanfurman).
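
For intuition, ORPO adds an odds-ratio preference penalty on top of the standard SFT loss. The sketch below is illustrative only (it is not the training code used for this run): it computes the odds-ratio term from the average per-token log-probabilities of the chosen and rejected responses.

```python
import math


def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio term of the ORPO objective (illustrative sketch).

    Inputs are average per-token log-probabilities of the chosen and
    rejected responses under the policy model. With odds(p) = p / (1 - p),
    the penalty is -log sigmoid(log odds_chosen - log odds_rejected).
    """

    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p / (1.0 - p))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return math.log1p(math.exp(-z))  # == -log sigmoid(z)


# The penalty shrinks as the model prefers the chosen response more:
print(orpo_odds_ratio_loss(-0.5, -2.0))  # small penalty (chosen preferred)
print(orpo_odds_ratio_loss(-2.0, -0.5))  # large penalty (rejected preferred)
```

The SFT term (negative log-likelihood on the chosen response) is added to a scaled version of this penalty to form the full objective.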

## 💻 Usage

<details>

<summary>Setup</summary>

```python
!pip install -qU transformers accelerate bitsandbytes
!huggingface-cli download dfurman/CalmeRys-78B-Orpo-v0.1
```

```python
import torch
import transformers
from transformers import AutoTokenizer, BitsAndBytesConfig

# use flash attention on Ampere (SM 8.0) or newer GPUs
if torch.cuda.get_device_capability()[0] >= 8:
    !pip install -qqq flash-attn
    attn_implementation = "flash_attention_2"
    torch_dtype = torch.bfloat16
else:
    attn_implementation = "eager"
    torch_dtype = torch.float16

# quantize if necessary
# bnb_config = BitsAndBytesConfig(
#     load_in_4bit=True,
#     bnb_4bit_quant_type="nf4",
#     bnb_4bit_compute_dtype=torch_dtype,
#     bnb_4bit_use_double_quant=True,
# )

model = "dfurman/CalmeRys-78B-Orpo-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={
        "torch_dtype": torch_dtype,
        # "quantization_config": bnb_config,
        "device_map": "auto",
        "attn_implementation": attn_implementation,
    },
)
```

</details>
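
If the full model in `bfloat16` exceeds your available VRAM, the commented-out `bitsandbytes` config in the setup block can be enabled. A sketch (4-bit NF4 with double quantization; assumes `bitsandbytes` is installed — adjust to your hardware):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config; pass it to the pipeline via
# model_kwargs={"quantization_config": bnb_config, ...}
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
```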

### Run

```python
question = """The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning.
They sold 93 loaves in the morning and 39 loaves in the afternoon.
A grocery store then returned 6 unsold loaves back to the bakery.
How many loaves of bread did the bakery have left?
Respond as succinctly as possible. Format the response as a completion of this table:
|step|subquestion|procedure|result|
|:---|:----------|:--------|:-----:|"""

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])
```
111
+
112
+ ```
113
+ ***Generation:
114
+ |1|Initial loaves|Start with total loaves|200|
115
+ |2|Sold in morning|Subtract morning sales|200 - 93 = 107|
116
+ |3|Sold in afternoon|Subtract afternoon sales|107 - 39 = 68|
117
+ |4|Returned loaves|Add returned loaves|68 + 6 = 74|
118
+ ```
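
The model's table arithmetic can be verified directly:

```python
loaves = 200   # baked on Monday morning
loaves -= 93   # sold in the morning
loaves -= 39   # sold in the afternoon
loaves += 6    # returned by the grocery store
print(loaves)  # 74, matching the model's final result
```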