Update README.md
README.md
CHANGED
@@ -34,7 +34,7 @@ For a detailed exposition, please refer to our accompanying technical report.
 | **Ours** | | | | | |
 | Ours (SFT baseline) | 8B | SFT | 10.2 | 7.69 | 5.6 |
 | Ours (DPO baseline) | 8B | Vanilla DPO | 22.5 | 8.17 | 22.4 |
-| Ours (Online RLHF) | 8B | Iterative DPO | **
+| Ours (Online RLHF) | 8B | Iterative DPO | **31.3** | **8.46** | **29.1** |
 | **Large Open-Sourced Models** | | | | | |
 | Vicuna-33b-v1.3 | 33B | SFT | 17.6 | 7.12 | 8.6 |
 | Yi-34B-Chat | 34B | SFT | 27.2 | - | 23.1 |