princeton-nlp committed
Commit • 05ab6c8
1 Parent(s): 84154c5

Update README.md

README.md CHANGED
@@ -5,8 +5,7 @@ tags: []
 
 # Model Card for Model ID
 
-
-
+SimPO (Simple Preference Optimization) is an offline preference optimization algorithm designed to enhance the training of large language models (LLMs) with preference optimization datasets. SimPO aligns the reward function with the generation likelihood, eliminating the need for a reference model and incorporating a target reward margin to boost performance. Please refer to our [preprint](https://arxiv.org/pdf/2405.14734) and [github repo](https://github.com/princeton-nlp/SimPO) for more details.
 
 
 ## Model Details
@@ -15,63 +14,24 @@ tags: []
 
 <!-- Provide a longer summary of what this model is. -->
 
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
-- **Developed by:**
-- **
-- **
-- **
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Developed by:** Yu Meng, Mengzhou Xia
+- **Model type:** Causal Language Model
+- **License:** gemma
+- **Finetuned from model:** [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
 
-### Model Sources
+### Model Sources
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:**
-- **Paper
-- **Demo
-
-## Uses
-
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
-### Direct Use
-
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
-[More Information Needed]
-
-### Downstream Use [optional]
-
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
-[More Information Needed]
-
-### Out-of-Scope Use
-
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
-[More Information Needed]
-
-## Bias, Risks, and Limitations
+- **Repository:** https://github.com/princeton-nlp/SimPO
+- **Paper:** https://arxiv.org/pdf/2405.14734
+- **Demo:** Soon to be alive
 
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
-[More Information Needed]
-
-### Recommendations
-
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
 ## How to Get Started with the Model
 
-Use the code below to get started with the model.
 
-[More Information Needed]
 
 ## Training Details
 
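
The SimPO summary added in this commit describes a reward tied to generation likelihood, no reference model, and a target reward margin. A minimal sketch of that per-pair objective, following the linked preprint, might look like the code below. The function name `simpo_loss`, the pure-Python form, and the default `beta` and `gamma` values are illustrative assumptions, not the authors' implementation or settings.

```python
import math


def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """Sketch of the SimPO loss for a single preference pair.

    chosen_logps / rejected_logps: per-token log-probabilities of the
    preferred and dispreferred responses under the policy being trained.
    beta and gamma are placeholder values (assumptions), not tuned settings.
    """
    # Implicit reward: length-normalized sequence log-likelihood scaled by beta.
    # No reference model appears anywhere in the reward.
    r_chosen = beta * sum(chosen_logps) / len(chosen_logps)
    r_rejected = beta * sum(rejected_logps) / len(rejected_logps)

    # The chosen reward should exceed the rejected reward by at least gamma.
    margin = r_chosen - r_rejected - gamma

    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

Because each reward is an average over tokens, repeating the same per-token log-probabilities over a longer response leaves the loss unchanged. This is one way to read "aligns the reward function with the generation likelihood".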