Update dataset details in model card

#11
by imone - opened
Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -147,6 +147,16 @@ OpenChat may sometimes generate harmful, hate speech, biased responses, or answe
 
  Our OpenChat 3.5 code and models are distributed under the Apache License 2.0.
 
+ ## Dataset Details
+
+ OpenChat 3.5 was trained with C-RLFT on a collection of publicly available high-quality instruction data, with a custom processing pipeline. We detail some notable subsets included here:
+
+ - [OpenChat ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset)
+ - [Open-Orca](https://huggingface.co/datasets/Open-Orca/OpenOrca)
+ - Capybara [1](https://huggingface.co/datasets/LDJnr/Pure-Dove) [2](https://huggingface.co/datasets/LDJnr/Verified-Camel) [3](https://huggingface.co/datasets/LDJnr/LessWrong-Amplify-Instruct)
+ - [GOAT](https://huggingface.co/datasets/tiedong/goat)
+ - [Glaive](https://huggingface.co/datasets/glaiveai/glaive-code-assistant)
+
  ## Citation
 
  ```
@@ -160,7 +170,7 @@ Our OpenChat 3.5 code and models are distributed under the Apache License 2.0.
 
  ## Acknowledgements
 
- We extend our heartfelt gratitude to Alignment Lab AI, Nous Research, and Pygmalion AI for their substantial contributions to data collection and model training.
+ We extend our heartfelt gratitude to AutoMeta and caesus from Alignment Lab AI, LDJ and Teknium from Nous Research, alpin and TearGosling from Pygmalion AI for their substantial contributions to data collection and model training.
 
  Special thanks go to Changling Liu from GPT Desk Pte. Ltd., Qiying Yu at Tsinghua University, Baochang Ma, and Hao Wan from 01.AI company for their generous provision of resources. We are also deeply grateful to Jianxiong Li and Peng Li at Tsinghua University for their insightful discussions.