princeton-nlp committed on
Commit
16c27b0
1 Parent(s): c580d90

Update README.md

Files changed (1)
  1. README.md +4 -25
README.md CHANGED
@@ -61,32 +61,11 @@ Fine-tuning the [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-
 
 ## Evaluation
 
-### Testing Data, Factors & Metrics
-
-#### Testing Data
-
-<!-- This should link to a Dataset Card if possible. -->
-
-[More Information Needed]
-
-#### Factors
-
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
-[More Information Needed]
-
-#### Metrics
-
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
-[More Information Needed]
-
-### Results
-
-[More Information Needed]
-
-#### Summary
 
+| Model | AlpacaEval 2 LC Win Rate | AlpacaEval 2 Raw Win Rate | Arena-Hard Win Rate | WildBench Elo |
+| :-------- | :------- | :------- | :------- | :------- |
+| gemma-2-9b-it | 51.1 | 38.1 | 40.8 | 1049.5 |
+| gemma-2-9b-it-SimPO | | | | |
 
 
 ## Technical Specifications