jacobfulano committed • Commit 02a80c2 • Parent(s): 515e294

Update README.md

README.md CHANGED
@@ -1,5 +1,10 @@
---
license: apache-2.0
+ tags:
+ - Composer
+ - MosaicML
+ - llm-foundry
+ - StreamingDatasets
---

# MPT-7B (Base)
@@ -35,17 +40,17 @@ We demonstrate generations as long as 80k tokens on a single A100-80GB GPU in ou
* [MPT-7B-Instruct](https://huggingface.co/mosaicml/mpt-7b-instruct): a model for short-form instruction following.
It is built by finetuning MPT-7B on a [dataset](https://huggingface.co/datasets/sam-mosaic/dolly_hhrlhf) we also release, derived from the [Databricks Dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and the [Anthropic Helpful and Harmless (HH-RLHF)](https://huggingface.co/datasets/Anthropic/hh-rlhf) datasets.
* License: _CC-By-SA-3.0_ (commercial use permitted)
- * [Online Demo](https://huggingface.co/spaces/mosaicml/mpt-7b-instruct)
+ * [Online Demo on HuggingFace Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-instruct)

* [MPT-7B-Chat](TBD): a chatbot-like model for dialogue generation.
It is built by finetuning MPT-7B on the [ShareGPT-Vicuna](https://huggingface.co/datasets/jeffwan/sharegpt_vicuna), [HC3](https://huggingface.co/datasets/Hello-SimpleAI/HC3),
[Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca), [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf), and [Evol-Instruct](https://huggingface.co/datasets/victor123/evol_instruct_70k) datasets.
* License: _CC-By-NC-SA-4.0_ (non-commercial use only)
- * [Online Demo](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)
+ * [Online Demo on HuggingFace Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)

## Model Date

- May
+ May 5, 2023

## Model License

@@ -53,9 +58,9 @@ Apache-2.0 (commercial use permitted)

## Documentation

- * [Blog post]
+ * [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
- * Questions: contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)
+ * Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!


## How to Use
@@ -166,19 +171,20 @@ While great efforts have been taken to clean the pretraining data, it is possibl

## Acknowledgements

+ We would like to thank our friends at AI2 for helping us to curate our pretraining dataset, choose a great tokenizer, and for many other helpful conversations along the way ⚔️
We gratefully acknowledge the work of the researchers who created the [LLaMA series of models](https://arxiv.org/abs/2302.13971), which was the impetus for our efforts.
-
+ and also acknowledge the hard work of the [Together](https://www.together.xyz) team, which put together the RedPajama dataset.

## Citation

Please cite this model using the following format:

```
- @online{
+ @online{MosaicML2023Introducing,
author = {MosaicML NLP Team},
- title = {
+ title = {Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs},
year = {2023},
- url = {
+ url = {www.mosaicml.com/blog/mpt-7b},
note = {Accessed: 2023-03-28}, % change this date
urldate = {2023-03-28} % change this date
}
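The hunks above sit alongside the README's "How to Use" section, which is not shown in this commit. As a rough sketch (not part of this commit), loading the base model typically goes through the transformers Auto classes with trust_remote_code=True, since MPT-7B's model class is defined in the model repository rather than in the transformers library; the prompt and generation settings below are placeholders.

```python
# Sketch only: load mosaicml/mpt-7b and run a short generation.
# trust_remote_code=True is needed because the MPT model class ships with the repo.
import transformers

model_name = "mosaicml/mpt-7b"

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

inputs = tokenizer("MosaicML is", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=20)   # placeholder generation length
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```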