princeton-nlp committed
Commit ebdb01f
1 parent: 088ed5d

Update README.md

Files changed (1)
  1. README.md +2 -7
README.md CHANGED
@@ -12,9 +12,7 @@ SimPO (Simple Preference Optimization) is an offline preference optimization alg
 
 ### Model Description
 
-We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on with the SimPO objective.
-, a preference optimization dataset where the prompts are from [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)
-
+We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on [princeton-nlp/gemma2-ultrafeedback-armorm](https://huggingface.co/datasets/princeton-nlp/gemma2-ultrafeedback-armorm) with the SimPO objective.
 
 - **Developed by:** Yu Meng, Mengzhou Xia, Danqi Chen
 - **Model type:** Causal Language Model
@@ -34,8 +32,6 @@ We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it
 ```
 import torch
 from transformers import pipeline
-import json
-import warnings
 
 model_id = "princeton-nlp/gemma-2-9b-it-SimPO"
 
@@ -45,7 +41,6 @@ generator = pipeline(
 model_kwargs={"torch_dtype": torch.bfloat16},
 device="cuda",
 )
-generator.tokenizer.chat_template = template
 outputs = generator([{"role": "user", "content": "What's the difference between llamas and alpacas?"}], do_sample=False, max_new_tokens=200)
 print(outputs[0]['generated_text'])
 ```
@@ -62,7 +57,7 @@ We use
 
 #### Speeds, Sizes, Times
 
-Fine-tuning the [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on takes around 100 mins to finish on 8xH100 GPUs.
+Fine-tuning the [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on [princeton-nlp/gemma2-ultrafeedback-armorm](https://huggingface.co/datasets/princeton-nlp/gemma2-ultrafeedback-armorm) takes around 100 mins to finish on 8xH100 GPUs.
 
 ## Evaluation
 
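Note on the "SimPO objective" named in the changed README line: the sketch below shows, for a single preference pair, the loss as it is commonly written (a length-normalized log-likelihood margin passed through a logistic loss). It is an illustration only, not the training code behind this model. The function name `simpo_loss` and the default `beta` and `gamma` values are assumptions chosen for the example; the hyperparameters actually used are not visible in this diff.

```python
import math


def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """Illustrative SimPO loss for one (chosen, rejected) response pair.

    chosen_logps / rejected_logps: per-token log-probabilities of each
    response under the policy model. beta and gamma are placeholder
    values (assumptions), not the settings used for this checkpoint.
    """
    # Implicit reward: beta times the average per-token log-probability.
    chosen_reward = beta * sum(chosen_logps) / len(chosen_logps)
    rejected_reward = beta * sum(rejected_logps) / len(rejected_logps)

    # The chosen response should beat the rejected one by at least gamma.
    margin = chosen_reward - rejected_reward - gamma

    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Usage: the loss is smaller when the chosen response is the more likely one.
good = simpo_loss([-0.5, -0.4], [-2.0, -1.8, -2.2])
bad = simpo_loss([-2.0, -1.8, -2.2], [-0.5, -0.4])
print(good < bad)
```

Because each reward is an average over tokens, a longer response does not score lower or higher simply for its length, and no reference model is needed to compute the loss.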