Update README.md
AlessandroW committed · Commit 59465a9 · 1 Parent(s): 835ec48

README.md CHANGED
@@ -18,7 +18,7 @@ This repo provides the GGUF format for the Phi-3-Mini-128K-Instruct.
 *For more details check out the original model at [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct).*
 
 
-
+The Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets. This dataset includes both synthetic data and filtered publicly available website data, with an emphasis on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family with the Mini version in two variants 4K and 128K which is the context length (in tokens) that it can support.
 
 After initial training, the model underwent a post-training process that involved supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures. When evaluated against benchmarks that test common sense, language understanding, mathematics, coding, long-term context, and logical reasoning, the Phi-3 Mini-128K-Instruct demonstrated robust and state-of-the-art performance among models with fewer than 13 billion parameters. Resources and Technical Documentation:
 
@@ -42,8 +42,4 @@ This repo provides GGUF files and Llamafiles ([`d228e01d`](https://github.com/Mo
 
 ### License
 
-The model is licensed under the [MIT license](https://huggingface.co/microsoft/phi-3-mini-128k/resolve/main/LICENSE).
-
-## Trademarks
-
-This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft’s Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party’s policies.
+The model is licensed under the [MIT license](https://huggingface.co/microsoft/phi-3-mini-128k/resolve/main/LICENSE).