This model is a 7B Chinese version of [Self-RAG](https://huggingface.co/selfrag/...).
It is trained on Baichuan2-7B-Chat with a sample of [belle](https://github.com/LianjiaTech/BELLE) SFT data, accompanied by interleaved passages from zhwiki (Chinese Wikipedia). The reflection tokens are aligned with the original English version of Self-RAG, so the usage is the same. Hope you enjoy it!

### Data

The data used to train the model is also available ([FINAL_OUTPUT_4w.jsonl](https://huggingface.co/Aman/selfrag-zh_baichuan2_7b_chat/blob/main/FINAL_OUTPUT_4w.jsonl)). It was constructed from [Belle](https://github.com/LianjiaTech/BELLE/tree/main/data/1.5M) SFT data and Chinese Wikipedia documents.
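The file is plain JSON Lines (one JSON object per line), so it can be inspected without extra tooling. A minimal loading sketch (the helper name `load_jsonl` is my own, and it assumes only the line-per-record format, not any particular field names):

```python
import json

def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines defensively
                records.append(json.loads(line))
    return records
```

For example, `records = load_jsonl("FINAL_OUTPUT_4w.jsonl")`, then inspect `records[0]` to see the schema.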

### Usage

#### The Critic Model

The critic model is released in the `critic/` folder. However, due to the limited quantity and quality of the critic data, its performance is still some distance from ideal.

#### The Generator

I observed some output errors when adopting vLLM to accelerate generation, possibly due to precision issues or to vLLM's implementation. I therefore use the original `generate` method from transformers.

```
...
# Model prediction: [Retrieval] <paragraph> ... (this query requires factual grounding, call a retriever) </paragraph> [Relevant] 太和殿、中和殿、保和殿 [Utility:5] </s>
```
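Because the reflection tokens follow the original English Self-RAG vocabulary, the prediction above can be post-processed with plain string handling. A sketch of such a parser (the function and token handling are my own illustration based on the original Self-RAG format, not a utility shipped with this repo; multi-line passages would additionally need `re.DOTALL`):

```python
import re

def parse_selfrag_output(text):
    """Split a Self-RAG generation into the plain answer and its reflection tokens."""
    result = {
        "needs_retrieval": "[Retrieval]" in text,  # retrieval decision token
        "relevant": "[Relevant]" in text,          # passage relevance token
        "utility": None,                           # [Utility:1] .. [Utility:5]
    }
    m = re.search(r"\[Utility:(\d)\]", text)
    if m:
        result["utility"] = int(m.group(1))
    # Drop bracketed tokens, retrieved-passage markup, and the EOS marker.
    result["answer"] = re.sub(
        r"\[[^\]]*\]|<paragraph>.*?</paragraph>|</s>", "", text
    ).strip()
    return result
```

Applied to the prediction above, this reports that retrieval was triggered, the passage was judged relevant, the utility score is 5, and the plain answer is 太和殿、中和殿、保和殿.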