allenai/llama-3-tulu-2-70b-uf-mean-rm


hamishivi committed on Jun 24

Commit 6233b52 • 1 Parent(s): 2826168

Update README.md

Files changed (1)

README.md +1 -1


@@ -32,7 +32,7 @@ We evaluate the model on [RewardBench](https://github.com/allenai/reward-bench):
 | Model            | Score | Chat  | Chat Hard | Safety | Reasoning |
 |------------------|-------|-------|-----------|--------|-----------|
 | [Llama 3 Tulu 2 8b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm) | 73.6  | 95.3  |    59.2   |  57.9  |    82.1   |
-| **[Llama 3 Tulu 2 70b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm) (this model)** |    73.5   |   89.1    |      52.6     |    64.0    |      88.3     |

+| **[Llama 3 Tulu 2 70b UF RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm) (this model)** |    71.0 | 86.3 | 56.1 | 58.9 | 82.7 |
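
As a quick sanity check on the updated row, the numbers in the table are consistent with the Score column being the unweighted mean of the four category columns (Chat, Chat Hard, Safety, Reasoning), rounded to one decimal. This is an observation drawn from the values in this diff, not a statement of how RewardBench computes its score; the `rows` dictionary and the `mean_score` helper below are hypothetical names used only for illustration.

```python
# Values copied from the table rows in this diff.
# Each entry: (reported Score, [Chat, Chat Hard, Safety, Reasoning]).
rows = {
    "8b": (73.6, [95.3, 59.2, 57.9, 82.1]),
    "70b (before this commit)": (73.5, [89.1, 52.6, 64.0, 88.3]),
    "70b (after this commit)": (71.0, [86.3, 56.1, 58.9, 82.7]),
}


def mean_score(categories):
    """Unweighted mean of the category scores, rounded to one decimal.

    Assumption: equal weighting of the four columns. The source does not
    state the scoring formula; this only checks consistency.
    """
    return round(sum(categories) / len(categories), 1)


for name, (reported, categories) in rows.items():
    computed = mean_score(categories)
    status = "consistent" if abs(computed - reported) < 0.05 else "differs"
    print(f"{name}: reported={reported}, mean={computed} ({status})")
```

Under that assumption, both the removed and the added 70b rows are internally consistent, so the commit replaces one complete set of results with another rather than correcting a single cell.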