nicholasKluge committed on
Commit
a43e186
1 Parent(s): 3954523

Update README.md

  1. README.md +6 -6
README.md CHANGED
@@ -27,7 +27,7 @@ co2_eq_emissions:
  ---
  # RewardModel
 
- The `RewardModel` is a [BERT](https://huggingface.co/bert-base-cased) model that can be used to score the quality of a completion for a given prompt.
+ The RewardModel is a [BERT](https://huggingface.co/bert-base-cased) model that can be used to score the quality of a completion for a given prompt.
 
  The model was trained with a dataset composed of `prompt`, `prefered_completions`, and `rejected_completions`.
 
@@ -48,7 +48,7 @@ This repository has the [source code](https://github.com/Nkluge-correa/Aira) use
 
  ## Usage
 
- Here's an example of how to use the `RewardModel` to score the quality of a response to a given prompt:
+ Here's an example of how to use the RewardModel to score the quality of a response to a given prompt:
 
  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
@@ -125,9 +125,9 @@ and bitching about what the machines do. Score: -10.942
 
  ## Performance
 
- | Acc | [WebGPT](https://huggingface.co/datasets/openai/webgpt_comparisons) |
- |---|---|
- | [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel) | 55.02%* |
+ | Acc | [WebGPT](https://huggingface.co/datasets/openai/webgpt_comparisons) |
+ |----------------------------------------------------------------------|---------------------------------------------------------------------|
+ | [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel) | 55.02%* |
 
  * *Only considering comparisons of the `webgpt_comparisons` dataset that had a preferred option.
 
@@ -149,4 +149,4 @@ and bitching about what the machines do. Score: -10.942
 
  ## License
 
- The `RewardModel` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
+ RewardModel is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.