andreaskoepf committed on
Commit
6d3663c
1 Parent(s): 5fa78b1

Update README.md

Files changed (1)
  1. README.md +43 -1
README.md CHANGED
@@ -4,4 +4,46 @@ license: apache-2.0
  - wandb: https://wandb.ai/open-assistant/reward-model/runs/5xld9wmd
  - checkpoint: 3500 steps
 
- Compute was generously provided by [Stability AI](https://stability.ai/)
+ Compute was generously provided by [Stability AI](https://stability.ai/)
+
+
+
+ ### How to use
+
+ ```python
+ # Install the Open-Assistant model_training module first (e.g. run `pip install -e .` in the `model/` directory of the open-assistant repository).
+ import model_training.models.reward_model  # noqa: F401 (registers reward model for AutoModel loading)
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ model_name = "..."  # placeholder: set this to the Hugging Face repository id of this model
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+ input_text = "<|prompter|>Hi how are you?<|endoftext|><|assistant|>Hi, I am Open-Assistant a large open-source language model trained by LAION AI. How can I help you today?<|endoftext|>"
+ inputs = tokenizer(input_text, return_tensors="pt")
+ # logits[0] holds the reward score for the first (and only) sequence in the batch
+ score = model(**inputs).logits[0].cpu().detach()
+ print(score)
+ ```
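A reward model is typically used to compare several candidate replies to the same prompt. The sketch below is a hedged illustration and not part of the original README: `format_dialogue` and `rank_replies` are hypothetical helper names, and `score_fn` stands in for any callable that maps a formatted string to a float. The special tokens are copied from the `input_text` example above; applying the same pattern to longer alternating dialogues is an assumption.

```python
from typing import Callable, List, Tuple

# Special tokens as they appear in the README's input_text example.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "<|endoftext|>"


def format_dialogue(turns: List[str]) -> str:
    """Join alternating prompter/assistant turns (prompter first), closing each with EOS."""
    parts = []
    for i, text in enumerate(turns):
        role = PROMPTER if i % 2 == 0 else ASSISTANT
        parts.append(f"{role}{text}{EOS}")
    return "".join(parts)


def rank_replies(
    prompt: str,
    replies: List[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[float, str]]:
    """Score each candidate reply to the prompt and return (score, reply) pairs, best first."""
    scored = [(score_fn(format_dialogue([prompt, reply])), reply) for reply in replies]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

With the model loaded as in the snippet above, `score_fn` could plausibly be `lambda text: model(**tokenizer(text, return_tensors="pt")).logits[0].item()`, assuming the classification head emits a single logit per sequence.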
+
+ ### Datasets
+
+ ```
+ datasets:
+   - oasst_export:
+       lang: "en,es,de,fr"
+       input_file_path: 2023-03-27_oasst_research_ready_synth.jsonl.gz
+       val_split: 0.1
+   - anthropic_rlhf:
+       fraction: 0.1
+       max_val_set: 1000
+   - shp:
+       max_val_set: 1000
+   - hellaswag:
+       fraction: 0.5
+       max_val_set: 1000
+   - webgpt:
+       val_split: 0.05
+       max_val_set: 1000
+   - hf_summary_pairs:
+       fraction: 0.1
+       max_val_set: 250
+ ```
+
+
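To make the interplay of the per-dataset options in the Datasets block above concrete, here is a rough sketch of how they might translate into example counts. It rests on assumptions the README does not confirm: that `val_split` reserves a share of the data for validation, that `max_val_set` caps the validation set size, and that `fraction` subsamples the remaining training portion. `split_sizes` is a hypothetical helper, not a function from the Open-Assistant code base, and the dataset size used in the usage note is invented for illustration.

```python
from typing import Optional, Tuple


def split_sizes(
    n_examples: int,
    val_split: float = 0.0,
    max_val_set: Optional[int] = None,
    fraction: float = 1.0,
) -> Tuple[int, int]:
    """Return (train_size, val_size) under the assumed meaning of the config keys."""
    # Assumption: val_split is the share of examples held out for validation.
    val_size = int(n_examples * val_split)
    # Assumption: max_val_set is an upper bound on the validation set size.
    if max_val_set is not None:
        val_size = min(val_size, max_val_set)
    # Assumption: fraction subsamples what is left for training.
    train_size = int((n_examples - val_size) * fraction)
    return train_size, val_size
```

For example, with a made-up dataset of 40,000 examples and the `webgpt` settings (`val_split: 0.05`, `max_val_set: 1000`), this reading gives 1,000 validation examples, because the cap applies, and 39,000 training examples.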