dfurman committed · Commit 42cd355 (1 parent: 1bd57a0)

Update README.md

README.md CHANGED (+100 -1)
inference: false
model_creator: dfurman
quantized_by: dfurman
---

# dfurman/CalmeRys-78B-Orpo-v0.1

## 🤖 Model

This model is a finetune of `MaziyarPanahi/calme-2.4-rys-78b` on 1.5k rows of the `mlabonne/orpo-dpo-mix-40k` dataset.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/NG5WGL0ljzLsNhSBRVqnD.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/Zhk5Bpr1I2NrzX98Bhtp8.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/WgnKQnYIFWkCRSW3JPVAb.png)

You can find the experiment on W&B at [this address](https://wandb.ai/dryanfurman/huggingface/runs/1w50nu70?nw=nwuserdryanfurman).
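
For intuition, ORPO adds an odds-ratio preference penalty on top of the standard SFT loss. The sketch below is illustrative only (it is not the training code used for this run): it computes the odds-ratio term from the average per-token log-probabilities of the chosen and rejected responses.

```python
import math


def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio term of the ORPO objective (illustrative sketch).

    Inputs are average per-token log-probabilities of the chosen and
    rejected responses under the policy model. With odds(p) = p / (1 - p),
    the penalty is -log sigmoid(log odds_chosen - log odds_rejected).
    """

    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p / (1.0 - p))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return math.log1p(math.exp(-z))  # == -log sigmoid(z)


# The penalty shrinks as the model prefers the chosen response more:
print(orpo_odds_ratio_loss(-0.5, -2.0))  # small penalty (chosen preferred)
print(orpo_odds_ratio_loss(-2.0, -0.5))  # large penalty (rejected preferred)
```

The SFT term (negative log-likelihood on the chosen response) is added to a scaled version of this penalty to form the full objective.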

## 💻 Usage

<details>

<summary>Setup</summary>

```python
!pip install -qU transformers accelerate bitsandbytes
!huggingface-cli download dfurman/CalmeRys-78B-Orpo-v0.1
```

```python
import torch
import transformers
from transformers import AutoTokenizer, BitsAndBytesConfig

# use flash attention on Ampere (SM 8.0) or newer GPUs
if torch.cuda.get_device_capability()[0] >= 8:
    !pip install -qqq flash-attn
    attn_implementation = "flash_attention_2"
    torch_dtype = torch.bfloat16
else:
    attn_implementation = "eager"
    torch_dtype = torch.float16

# quantize if necessary
# bnb_config = BitsAndBytesConfig(
#     load_in_4bit=True,
#     bnb_4bit_quant_type="nf4",
#     bnb_4bit_compute_dtype=torch_dtype,
#     bnb_4bit_use_double_quant=True,
# )

model = "dfurman/CalmeRys-78B-Orpo-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={
        "torch_dtype": torch_dtype,
        # "quantization_config": bnb_config,
        "device_map": "auto",
        "attn_implementation": attn_implementation,
    },
)
```

</details>
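
If the full model in `bfloat16` exceeds your available VRAM, the commented-out `bitsandbytes` config in the setup block can be enabled. A sketch (4-bit NF4 with double quantization; assumes `bitsandbytes` is installed — adjust to your hardware):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config; pass it to the pipeline via
# model_kwargs={"quantization_config": bnb_config, ...}
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
```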

### Run

```python
question = """The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning.
They sold 93 loaves in the morning and 39 loaves in the afternoon.
A grocery store then returned 6 unsold loaves back to the bakery.
How many loaves of bread did the bakery have left?
Respond as succinctly as possible. Format the response as a completion of this table:
|step|subquestion|procedure|result|
|:---|:----------|:--------|:-----:|"""

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])
```
111
+
112
+ ```
113
+ ***Generation:
114
+ |1|Initial loaves|Start with total loaves|200|
115
+ |2|Sold in morning|Subtract morning sales|200 - 93 = 107|
116
+ |3|Sold in afternoon|Subtract afternoon sales|107 - 39 = 68|
117
+ |4|Returned loaves|Add returned loaves|68 + 6 = 74|
118
+ ```
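
The model's table arithmetic can be verified directly:

```python
loaves = 200   # baked on Monday morning
loaves -= 93   # sold in the morning
loaves -= 39   # sold in the afternoon
loaves += 6    # returned by the grocery store
print(loaves)  # 74, matching the model's final result
```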