achal-tri committed
Commit 6f7b089
1 Parent(s): 94356f2

Update README.md

Files changed (1)
  1. README.md +11 -14
README.md CHANGED
@@ -60,25 +60,22 @@ Here are the evaluation results for DCLM-1B models on various tasks (using [llm-
  | DCLM-1B | 45.2 | 28.1 | 47.5 |
  | DCLM-1B-IT| 47.1 | 33.6 | 51.4 |

- Note: All scores are presented as decimal values between 0 and 1, representing the proportion of correct answers or the model's performance on each task.
-
- Moreover, we present our evaluation results on Length-Controlled Alpaca-Eval 2.0 to measure our instruction-following capabilities.
+ Moreover, we present our evaluation results on Length-Controlled AlpacaEval 2.0 to measure instruction-following capability. We report results
+ from the public leaderboard for non-DCLM models. We compare against state-of-the-art small models and also include a few larger models for reference.

  | Model | AlpacaEval2.0 LC Win-rate (%) |
  |------------------------------------|------------------------------:|
- | **Our runs** | |
- | DCLM-IT-1B | **8.6** |
- | DCLM-IT-7B | 16.6 |
- | **Reported from the leaderboard** | |
- | Gemma-Instruct-7B | 10.4 |
- | Nous-Hermes-13B | 9.7 |
- | DaVinci001 | 9.0 |
- | LLaMA-2-Chat-13B | 8.4 |
- | Alpaca-7B | 5.9 |
+ | Qwen1.5 1.8B Chat | 2.6 |
  | Gemma-Instruct-2B | 5.4 |
  | Phi-2 SFT | 5.9 |
- | Qwen1.5 1.8B Chat | 2.6 |
- |--------------------------------------------------------------------|
+ | DCLM-IT-1B | **8.6** |
+ | **Larger model sizes** | |
+ | Alpaca-7B | 5.9 |
+ | LLaMA-2-Chat-13B | 8.4 |
+ | DaVinci001 | 9.0 |
+ | Nous-Hermes-13B | 9.7 |
+ | Gemma-Instruct-7B | 10.4 |
+ | DCLM-IT-7B | 16.6 |

  ## Example Code

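For context, the length-controlled (LC) win-rates in the table above come from the AlpacaEval 2.0 harness. Below is a minimal, hedged sketch of how such a number could be reproduced: generate responses to the 805 AlpacaEval instructions with `transformers`, then score them with the `alpaca_eval` CLI. The model id, decoding settings, and output filename are placeholders, not taken from this commit.

```python
# Hedged sketch: produce AlpacaEval 2.0 model outputs, then score them
# with the external `alpaca_eval` package. The model id and generation
# settings below are assumptions, not from this commit.
import json

import datasets
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TRI-ML/DCLM-1B-IT"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# The 805 evaluation instructions used by AlpacaEval.
eval_set = datasets.load_dataset(
    "tatsu-lab/alpaca_eval", "alpaca_eval", trust_remote_code=True
)["eval"]

outputs = []
for example in eval_set:
    prompt = example["instruction"]
    # An instruction-tuned checkpoint may expect a chat template;
    # raw prompts are used here for brevity.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    generated = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    completion = tokenizer.decode(
        generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    outputs.append(
        {"instruction": prompt, "output": completion, "generator": "DCLM-IT-1B"}
    )

with open("dclm_it_1b_outputs.json", "w") as f:
    json.dump(outputs, f)

# With the alpaca-eval package installed and an OpenAI key configured,
# the LC win-rate is then reported by:
#   alpaca_eval --model_outputs dclm_it_1b_outputs.json
```

The length-controlled metric adjusts the raw win-rate for response length, so a model cannot inflate its score simply by producing longer answers.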