Update README.md

SteerLM Llama-2 is a 13 billion parameter generative language model based on the Llama-2 architecture.

Key capabilities enabled by SteerLM:

- Dynamic steering of responses by specifying desired attributes like quality, helpfulness, and toxicity (see the prompt sketch after this list).
- Simplified training compared to RLHF techniques like fine-tuning and bootstrapping.
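
One way such steering can surface at inference time is by encoding the desired attribute values directly in the prompt. The sketch below is illustrative only: the `<extra_id_*>` template tokens, attribute names, and 0-4 value scale are assumptions modeled on published SteerLM examples, so check the model card for the exact format.

```python
# Hypothetical sketch: steer a SteerLM response by appending desired
# attribute values to the prompt. Template tokens, attribute names, and
# the 0-4 value scale are assumptions; verify against the model card.

def build_steerlm_prompt(user_message: str, attributes: dict[str, int]) -> str:
    # Attributes are serialized as comma-separated "name:value" pairs
    # that condition the assistant's next turn.
    attr_string = ",".join(f"{name}:{value}" for name, value in attributes.items())
    return (
        "<extra_id_0>System\n"
        "A chat between a curious user and an artificial intelligence assistant.\n"
        f"<extra_id_1>User\n{user_message}\n"
        f"<extra_id_1>Assistant\n<extra_id_2>{attr_string}\n"
    )

prompt = build_steerlm_prompt(
    "Explain photosynthesis in two sentences.",
    {"quality": 4, "helpfulness": 4, "toxicity": 0},  # high quality, non-toxic
)
print(prompt)
```

Changing the attribute string (for example, lowering `toxicity` or raising `helpfulness`) alters the generated response without touching the model weights, which is what makes the steering dynamic.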
## Model Architecture and Training

The SteerLM method involves the following key steps:

1. Train an attribute prediction model on human-annotated data to evaluate response quality.
2. Use this model to annotate diverse datasets and enrich training data.
3. Perform conditioned fine-tuning to align responses with specified combinations of attributes (a data-formatting sketch follows this list).
4. (Optionally) Bootstrap training through model sampling and further fine-tuning.
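
To make step 3 concrete, the sketch below shows one plausible way to fold the attribute labels from step 2 into supervised fine-tuning samples. The `<attributes>` tag, field names, and label values are all hypothetical; the actual SteerLM training format may differ.

```python
# Hypothetical sketch of attribute-conditioned fine-tuning data (step 3).
# Each training pair is reformatted so the input carries the attribute
# labels produced by the step-2 annotation pass; the field names and the
# <attributes> tag are illustrative, not the actual SteerLM format.

annotated_examples = [
    {
        "prompt": "How do I sort a list in Python?",
        "response": "Use the built-in sorted() function or the list.sort() method.",
        # Labels assigned by the attribute prediction model (illustrative values).
        "attributes": {"quality": 4, "helpfulness": 4, "toxicity": 0},
    },
]

def to_conditioned_sample(example: dict) -> dict:
    """Fold attribute labels into the input so the model learns to condition on them."""
    attr_string = ",".join(f"{k}:{v}" for k, v in example["attributes"].items())
    return {
        # Input: user prompt plus the attribute combination the response should match.
        "input": f"{example['prompt']}\n<attributes>{attr_string}</attributes>",
        # Target: the response those attributes describe.
        "target": example["response"],
    }

for sample in map(to_conditioned_sample, annotated_examples):
    print(sample["input"], "->", sample["target"])
```

At inference time the same attribute slot is filled with user-chosen values, which is how conditioned fine-tuning enables the dynamic steering described above.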
SteerLM Llama-2 applies this technique on top of the Llama-2 architecture. It was pretrained on internet-scale data and then customized using [OASST](https://huggingface.co/datasets/OpenAssistant/oasst1) and [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) data.