Update README.md
README.md
CHANGED
@@ -2,9 +2,32 @@
 license: apache-2.0
 ---
 
-
-Hyperparameters: learning rate: 2e-5, weight decay: 0.01, per_device_train_batch_size: 16, per_device_eval_batch_size: 16, gradient_accumulation_steps:1, eval steps: 5000, max_length: 128, num_epochs: 3.
-Dataset version: “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”
-Checkpoint: 10000 steps
+**Hyperparameters:**
 
-
+- learning rate: 2e-5
+- weight decay: 0.01
+- per_device_train_batch_size: 16
+- per_device_eval_batch_size: 16
+- gradient_accumulation_steps:1
+- eval steps: 5000
+- max_length: 128
+- num_epochs: 3
+
+**Dataset version:**
+- “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”
+
+**Checkpoint:**
+
+- 10000 steps
+
+**Results on Validation set:**
+
+| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
+|-------|---------------|-----------------|----------|-----------|----------|----------|
+| 5000 | 0.036400 | 0.266518 | 0.926913 | 0.999662 | 0.916934 | 0.956513 |
+| 10000 | 0.022500 | 0.222881 | 0.952443 | 0.999494 | 0.946227 | 0.972132 |
+| 15000 | 0.016600 | 0.634102 | 0.882638 | 0.999789 | 0.866301 | 0.928270 |
+| 20000 | 0.011300 | 1.138026 | 0.849013 | 0.999796 | 0.827928 | 0.905781 |
+| 25000 | 0.010300 | 0.623522 | 0.895619 | 0.999728 | 0.881166 | 0.936710 |
+| 30000 | 0.006300 | 0.776632 | 0.879492 | 0.999804 | 0.862697 | 0.926204 |
+| 35000 | 0.000500 | 0.704599 | 0.899149 | 0.999698 | 0.885220 | 0.938982 |
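
Review note: the validation table added in this change can be sanity-checked with a short script. The sketch below recomputes F1 as the harmonic mean of precision and recall for each row and finds the step with the lowest validation loss. The row values are copied from the table; the names `ROWS`, `f1_from`, `max_f1_gap`, and `best_step` are illustrative and not part of the repository. Selecting by validation loss is an assumption here, since the README does not say how the 10000-step checkpoint was chosen.

```python
# Rows copied from the "Results on Validation set" table:
# (step, validation_loss, precision, recall, reported_f1)
ROWS = [
    (5000, 0.266518, 0.999662, 0.916934, 0.956513),
    (10000, 0.222881, 0.999494, 0.946227, 0.972132),
    (15000, 0.634102, 0.999789, 0.866301, 0.928270),
    (20000, 1.138026, 0.999796, 0.827928, 0.905781),
    (25000, 0.623522, 0.999728, 0.881166, 0.936710),
    (30000, 0.776632, 0.999804, 0.862697, 0.926204),
    (35000, 0.704599, 0.999698, 0.885220, 0.938982),
]


def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Largest gap between the recomputed and the reported F1 across all rows.
max_f1_gap = max(abs(f1_from(p, r) - f1) for _, _, p, r, f1 in ROWS)

# Step with the lowest validation loss (an assumed selection rule,
# not one stated in the README).
best_step = min(ROWS, key=lambda row: row[1])[0]

print(f"max |recomputed F1 - reported F1| = {max_f1_gap:.6f}")
print(f"step with lowest validation loss: {best_step}")
```

Under these assumptions the recomputed F1 agrees with the reported column to within rounding, and the lowest validation loss falls at step 10000, which matches the checkpoint listed in the README.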