Commit 754acd6 by manojpreveen
1 parent: 8d786b3
Update README.md
README.md CHANGED
@@ -11,6 +11,11 @@ First Version of Fine Tuned Bloomz-7B1 model on CoT dataset from Flan Data Colle
 
 * Epochs: 8
 * Batch Size : 5 instantaneous per device x 2 gradient accumulation steps x 8 gpus = 80
+* Max Length : 1024
+* Weight Decay : 0
+* Learning Rate : 5e-5
+* Learning Rate Scheduler Type : Linear
+* Number of warmup steps : 0
 * Machine : 8xA100 80GB
 
 **Dataset Details :**
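
The batch size line in this hunk multiplies out as 5 x 2 x 8 = 80, and the added lines describe a linear learning-rate schedule starting at 5e-5 with no warmup. The sketch below illustrates both calculations in plain Python. It is an assumption-laden illustration rather than the author's training code: the function names are made up for this example, `total_steps` is a hypothetical value (the README does not state the number of optimizer steps), and the decay rule assumes "Linear" means the common linear-with-warmup schedule that decays to zero at the end of training.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return per_device * grad_accum_steps * num_gpus


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear schedule (assumed form).

    Ramps up linearly during warmup (skipped when warmup_steps is 0),
    then decays linearly to zero at total_steps.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Values from the README: 5 per device, 2 accumulation steps, 8 GPUs.
print(effective_batch_size(5, 2, 8))  # 80

# total_steps=1000 is hypothetical, chosen only to show the shape of the decay.
for step in (0, 500, 1000):
    print(step, linear_lr(step, total_steps=1000))
```

With zero warmup steps the schedule starts at the full learning rate on the first step, which is why the warmup branch never runs for the settings listed here.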