ArkaAbacus committed · Commit 1dbbf47 · Parent(s): 42ac13a · Update README.md
---
license: apache-2.0
datasets:
- abacusai/MetaMathFewshot
- shahules786/orca-chat
- anon8231489123/ShareGPT_Vicuna_unfiltered
---

Trained from the base Mistral model on the MetaMathFewshot dataset (https://huggingface.co/datasets/abacusai/MetaMathFewshot), along with the Vicuna (https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered) and OrcaChat (https://huggingface.co/datasets/shahules786/orca-chat) datasets.
Instruction tuned with the following parameters:

- LoRA, Rank 8, Alpha 16, Dropout 0.05, all modules (QKV and MLP)
- 3 epochs
- Micro batch size 32 over 4xH100, gradient accumulation steps = 1
- AdamW with learning rate 5e-5
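Two quantities implied by these settings, but not stated explicitly, are the effective global batch size and the LoRA scaling factor. A minimal sketch (variable names are illustrative, not taken from the training code):

```python
# Hypothetical sketch: quantities implied by the hyperparameters above.
micro_batch_size = 32   # per-GPU micro batch size
num_gpus = 4            # 4xH100
grad_accum_steps = 1    # gradient accumulation steps

# Effective global batch = per-GPU batch * number of GPUs * accumulation steps
effective_batch_size = micro_batch_size * num_gpus * grad_accum_steps
print(effective_batch_size)  # 128

# LoRA applies its update scaled by alpha / rank
lora_alpha, lora_rank = 16, 8
lora_scaling = lora_alpha / lora_rank
print(lora_scaling)  # 2.0
```

With gradient accumulation fixed at 1, the global batch seen by the optimizer is simply the per-GPU batch times the GPU count; the alpha/rank ratio of 2 means the learned low-rank update is doubled before being added to the frozen weights.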