nicholasKluge committed · Commit a43e186 · 1 Parent(s): 3954523

Update README.md
README.md CHANGED
@@ -27,7 +27,7 @@ co2_eq_emissions:
 ---
 # RewardModel
 
-The
+The RewardModel is a [BERT](https://huggingface.co/bert-base-cased) model that can be used to score the quality of a completion for a given prompt.
 
 The model was trained with a dataset composed of `prompt`, `prefered_completions`, and `rejected_completions`.
 
@@ -48,7 +48,7 @@ This repository has the [source code](https://github.com/Nkluge-correa/Aira) use
 
 ## Usage
 
-Here's an example of how to use the
+Here's an example of how to use the RewardModel to score the quality of a response to a given prompt:
 
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
@@ -125,9 +125,9 @@ and bitching about what the machines do. Score: -10.942
 
 ## Performance
 
-| Acc
-
-| [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel)
+| Acc | [WebGPT](https://huggingface.co/datasets/openai/webgpt_comparisons) |
+|----------------------------------------------------------------------|---------------------------------------------------------------------|
+| [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel) | 55.02%* |
 
 * *Only considering comparisons of the `webgpt_comparisons` dataset that had a preferred option.
 
@@ -149,4 +149,4 @@ and bitching about what the machines do. Score: -10.942
 
 ## License
 
-
+RewardModel is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
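The README's Usage snippet is cut off in this diff (only the `transformers` import at line 54 appears as context). As a rough, illustrative sketch of what scoring with this reward model could look like, and not the README's actual code, the example below loads `nicholasKluge/RewardModel` with `AutoModelForSequenceClassification` and prints a score for each candidate response. The prompt and responses are made up, and the pair-encoding call, `max_length=512` truncation, and single-logit head are assumptions consistent with the scalar "Score: -10.942" output quoted in the hunk headers.

```python
# Minimal sketch (assumed usage, not the README's exact snippet): load the
# reward model and score candidate responses for a single prompt.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/RewardModel")
model = AutoModelForSequenceClassification.from_pretrained("nicholasKluge/RewardModel")
model.to(device)
model.eval()

# Hypothetical prompt and completions, for illustration only.
prompt = "Why is AI ethics important?"
responses = [
    "AI ethics matters because it helps ensure systems are built and deployed responsibly.",
    "Who cares? The machines will just do whatever they are told.",
]

for response in responses:
    # Encode the (prompt, response) pair; 512 is an assumed truncation length.
    inputs = tokenizer(prompt, response, truncation=True, max_length=512,
                       return_tensors="pt").to(device)
    with torch.no_grad():
        # Assumes a single-logit head, matching the scalar scores shown above.
        score = model(**inputs).logits[0].item()
    print(f"Response: {response}\nScore: {score:.3f}\n")
```

Under these assumptions, a higher score indicates a completion closer to the `prefered_completions` seen in training than to the `rejected_completions`.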