---
license: gemma
base_model: google/gemma-2-2b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-2b_hs2_accumulate_iter18_sftsd1
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# collapse_gemma-2-2b_hs2_accumulate_iter18_sftsd1

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0987
- Num Input Tokens Seen: 92143336
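
The reported loss can be read as a perplexity. The sketch below assumes the value is a mean per-token cross-entropy in nats, which is typical for causal language model fine-tuning but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.0987

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.3f}")
```

Under that assumption the perplexity is very close to 3.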

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 1
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
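
The batch-size and warmup values above are related by simple arithmetic. The following is a minimal sketch in plain Python, not the training script itself. The device count, the estimated number of optimizer steps, and the helper `lr_at` are assumptions introduced for illustration: the step count is inferred from the last logged row of the results table (step 1730 at epoch 0.9991), and the schedule is modelled as a linear ramp followed by a constant rate.

```python
import math

learning_rate = 8e-06
train_batch_size = 8              # per-device batch size
gradient_accumulation_steps = 16
warmup_ratio = 0.05

# Assumption: a single device. This is not stated in the card, but it is
# the value consistent with the reported total_train_batch_size of 128.
num_devices = 1

total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Assumption: total optimizer steps estimated from the last logged row
# (step 1730 at epoch 0.9991); the exact count is not given in the card.
estimated_total_steps = round(1730 / 0.9991)

# Assumption: warmup steps are the ratio times total steps, rounded up.
warmup_steps = math.ceil(warmup_ratio * estimated_total_steps)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then a constant learning rate."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


print(total_train_batch_size, estimated_total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps // 2), lr_at(warmup_steps))
```

With these assumptions, warmup covers roughly the first 87 optimizer steps, after which the learning rate stays at 8e-06 for the rest of the epoch.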

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.6987        | 0.0029 | 5    | 1.3904          | 270960            |
| 1.6486        | 0.0058 | 10   | 1.3832          | 543776            |
| 1.6449        | 0.0087 | 15   | 1.3606          | 806136            |
| 1.5808        | 0.0116 | 20   | 1.3291          | 1069472           |
| 1.5584        | 0.0144 | 25   | 1.2860          | 1329032           |
| 1.4797        | 0.0173 | 30   | 1.2449          | 1597912           |
| 1.3686        | 0.0202 | 35   | 1.2173          | 1866088           |
| 1.2231        | 0.0231 | 40   | 1.1920          | 2134728           |
| 1.0982        | 0.0260 | 45   | 1.1923          | 2409176           |
| 0.9703        | 0.0289 | 50   | 1.2209          | 2669112           |
| 0.9416        | 0.0318 | 55   | 1.2490          | 2934120           |
| 0.7829        | 0.0347 | 60   | 1.2681          | 3202328           |
| 0.7175        | 0.0375 | 65   | 1.2750          | 3468824           |
| 0.6113        | 0.0404 | 70   | 1.3062          | 3737368           |
| 0.4934        | 0.0433 | 75   | 1.3147          | 4000568           |
| 0.4708        | 0.0462 | 80   | 1.3241          | 4269560           |
| 0.3086        | 0.0491 | 85   | 1.2773          | 4527344           |
| 0.3285        | 0.0520 | 90   | 1.2721          | 4793952           |
| 0.3032        | 0.0549 | 95   | 1.2753          | 5062120           |
| 0.2329        | 0.0578 | 100  | 1.2260          | 5326728           |
| 0.3035        | 0.0606 | 105  | 1.2274          | 5600864           |
| 0.2459        | 0.0635 | 110  | 1.2213          | 5864096           |
| 0.2759        | 0.0664 | 115  | 1.2208          | 6122280           |
| 0.18          | 0.0693 | 120  | 1.2135          | 6389656           |
| 0.1301        | 0.0722 | 125  | 1.2084          | 6654192           |
| 0.2706        | 0.0751 | 130  | 1.1997          | 6926752           |
| 0.2556        | 0.0780 | 135  | 1.2031          | 7195712           |
| 0.2105        | 0.0809 | 140  | 1.2011          | 7464520           |
| 0.1426        | 0.0837 | 145  | 1.1993          | 7730008           |
| 0.1783        | 0.0866 | 150  | 1.2052          | 7998384           |
| 0.1648        | 0.0895 | 155  | 1.1888          | 8273232           |
| 0.1672        | 0.0924 | 160  | 1.1909          | 8546224           |
| 0.1124        | 0.0953 | 165  | 1.1949          | 8821008           |
| 0.1286        | 0.0982 | 170  | 1.1836          | 9091072           |
| 0.1845        | 0.1011 | 175  | 1.1917          | 9359184           |
| 0.2387        | 0.1040 | 180  | 1.1794          | 9619376           |
| 0.2107        | 0.1068 | 185  | 1.1842          | 9885592           |
| 0.1587        | 0.1097 | 190  | 1.1820          | 10152520          |
| 0.1815        | 0.1126 | 195  | 1.1793          | 10415520          |
| 0.1845        | 0.1155 | 200  | 1.1765          | 10682032          |
| 0.1766        | 0.1184 | 205  | 1.1769          | 10953560          |
| 0.1884        | 0.1213 | 210  | 1.1726          | 11223304          |
| 0.1853        | 0.1242 | 215  | 1.1755          | 11495576          |
| 0.184         | 0.1271 | 220  | 1.1777          | 11757464          |
| 0.1219        | 0.1299 | 225  | 1.1683          | 12015416          |
| 0.1437        | 0.1328 | 230  | 1.1761          | 12289472          |
| 0.1693        | 0.1357 | 235  | 1.1720          | 12558632          |
| 0.1442        | 0.1386 | 240  | 1.1686          | 12822488          |
| 0.2042        | 0.1415 | 245  | 1.1665          | 13079816          |
| 0.2118        | 0.1444 | 250  | 1.1705          | 13346176          |
| 0.1309        | 0.1473 | 255  | 1.1677          | 13611488          |
| 0.1784        | 0.1502 | 260  | 1.1683          | 13875176          |
| 0.1946        | 0.1530 | 265  | 1.1684          | 14141064          |
| 0.2299        | 0.1559 | 270  | 1.1637          | 14405528          |
| 0.1208        | 0.1588 | 275  | 1.1620          | 14674184          |
| 0.1641        | 0.1617 | 280  | 1.1586          | 14938784          |
| 0.1519        | 0.1646 | 285  | 1.1576          | 15202512          |
| 0.1524        | 0.1675 | 290  | 1.1634          | 15459472          |
| 0.1406        | 0.1704 | 295  | 1.1566          | 15724376          |
| 0.1566        | 0.1733 | 300  | 1.1578          | 15991128          |
| 0.1077        | 0.1761 | 305  | 1.1562          | 16258728          |
| 0.0978        | 0.1790 | 310  | 1.1572          | 16531472          |
| 0.1482        | 0.1819 | 315  | 1.1593          | 16795440          |
| 0.11          | 0.1848 | 320  | 1.1581          | 17054568          |
| 0.1224        | 0.1877 | 325  | 1.1565          | 17320832          |
| 0.1237        | 0.1906 | 330  | 1.1548          | 17589160          |
| 0.1225        | 0.1935 | 335  | 1.1549          | 17859592          |
| 0.1546        | 0.1964 | 340  | 1.1556          | 18122264          |
| 0.1408        | 0.1992 | 345  | 1.1552          | 18385856          |
| 0.121         | 0.2021 | 350  | 1.1556          | 18659936          |
| 0.2222        | 0.2050 | 355  | 1.1535          | 18918392          |
| 0.1528        | 0.2079 | 360  | 1.1541          | 19182728          |
| 0.1534        | 0.2108 | 365  | 1.1531          | 19450968          |
| 0.1442        | 0.2137 | 370  | 1.1496          | 19714280          |
| 0.1244        | 0.2166 | 375  | 1.1492          | 19977192          |
| 0.1912        | 0.2195 | 380  | 1.1534          | 20245192          |
| 0.1174        | 0.2224 | 385  | 1.1512          | 20509528          |
| 0.1046        | 0.2252 | 390  | 1.1502          | 20777056          |
| 0.1868        | 0.2281 | 395  | 1.1460          | 21041064          |
| 0.1649        | 0.2310 | 400  | 1.1449          | 21300688          |
| 0.1247        | 0.2339 | 405  | 1.1452          | 21571768          |
| 0.1122        | 0.2368 | 410  | 1.1434          | 21841096          |
| 0.2296        | 0.2397 | 415  | 1.1419          | 22107696          |
| 0.1551        | 0.2426 | 420  | 1.1422          | 22377296          |
| 0.1198        | 0.2455 | 425  | 1.1438          | 22647200          |
| 0.1214        | 0.2483 | 430  | 1.1441          | 22909288          |
| 0.1918        | 0.2512 | 435  | 1.1455          | 23176680          |
| 0.1422        | 0.2541 | 440  | 1.1450          | 23446384          |
| 0.1168        | 0.2570 | 445  | 1.1442          | 23711824          |
| 0.099         | 0.2599 | 450  | 1.1412          | 23974416          |
| 0.1084        | 0.2628 | 455  | 1.1436          | 24238504          |
| 0.1797        | 0.2657 | 460  | 1.1436          | 24504696          |
| 0.177         | 0.2686 | 465  | 1.1398          | 24765792          |
| 0.1445        | 0.2714 | 470  | 1.1427          | 25032552          |
| 0.1558        | 0.2743 | 475  | 1.1380          | 25295104          |
| 0.1002        | 0.2772 | 480  | 1.1363          | 25555720          |
| 0.1082        | 0.2801 | 485  | 1.1421          | 25817792          |
| 0.1059        | 0.2830 | 490  | 1.1393          | 26081592          |
| 0.1164        | 0.2859 | 495  | 1.1368          | 26347656          |
| 0.096         | 0.2888 | 500  | 1.1381          | 26617936          |
| 0.1255        | 0.2917 | 505  | 1.1374          | 26884744          |
| 0.1734        | 0.2945 | 510  | 1.1373          | 27149264          |
| 0.1357        | 0.2974 | 515  | 1.1365          | 27417000          |
| 0.1836        | 0.3003 | 520  | 1.1372          | 27682520          |
| 0.0934        | 0.3032 | 525  | 1.1400          | 27949544          |
| 0.0914        | 0.3061 | 530  | 1.1380          | 28216896          |
| 0.1157        | 0.3090 | 535  | 1.1353          | 28479296          |
| 0.146         | 0.3119 | 540  | 1.1351          | 28755216          |
| 0.1954        | 0.3148 | 545  | 1.1357          | 29019640          |
| 0.1166        | 0.3176 | 550  | 1.1334          | 29281320          |
| 0.1295        | 0.3205 | 555  | 1.1343          | 29546448          |
| 0.1361        | 0.3234 | 560  | 1.1355          | 29805376          |
| 0.1249        | 0.3263 | 565  | 1.1329          | 30074584          |
| 0.1307        | 0.3292 | 570  | 1.1340          | 30343192          |
| 0.1761        | 0.3321 | 575  | 1.1352          | 30600184          |
| 0.1241        | 0.3350 | 580  | 1.1304          | 30865784          |
| 0.1802        | 0.3379 | 585  | 1.1308          | 31131480          |
| 0.1077        | 0.3407 | 590  | 1.1331          | 31400416          |
| 0.2017        | 0.3436 | 595  | 1.1331          | 31672048          |
| 0.1348        | 0.3465 | 600  | 1.1299          | 31937752          |
| 0.1469        | 0.3494 | 605  | 1.1312          | 32203736          |
| 0.0765        | 0.3523 | 610  | 1.1303          | 32469208          |
| 0.1269        | 0.3552 | 615  | 1.1302          | 32734112          |
| 0.0929        | 0.3581 | 620  | 1.1305          | 33006400          |
| 0.2169        | 0.3610 | 625  | 1.1301          | 33270272          |
| 0.0898        | 0.3638 | 630  | 1.1272          | 33532832          |
| 0.1692        | 0.3667 | 635  | 1.1290          | 33797256          |
| 0.09          | 0.3696 | 640  | 1.1289          | 34063048          |
| 0.0708        | 0.3725 | 645  | 1.1301          | 34326024          |
| 0.1575        | 0.3754 | 650  | 1.1272          | 34597328          |
| 0.1042        | 0.3783 | 655  | 1.1269          | 34858432          |
| 0.1163        | 0.3812 | 660  | 1.1316          | 35121080          |
| 0.1444        | 0.3841 | 665  | 1.1375          | 35389184          |
| 0.1591        | 0.3869 | 670  | 1.1321          | 35649584          |
| 0.0957        | 0.3898 | 675  | 1.1245          | 35917320          |
| 0.2456        | 0.3927 | 680  | 1.1294          | 36187656          |
| 0.1111        | 0.3956 | 685  | 1.1298          | 36444688          |
| 0.103         | 0.3985 | 690  | 1.1280          | 36709440          |
| 0.0784        | 0.4014 | 695  | 1.1259          | 36972760          |
| 0.1514        | 0.4043 | 700  | 1.1284          | 37233720          |
| 0.1235        | 0.4072 | 705  | 1.1293          | 37497496          |
| 0.086         | 0.4100 | 710  | 1.1256          | 37762992          |
| 0.1205        | 0.4129 | 715  | 1.1255          | 38029864          |
| 0.0625        | 0.4158 | 720  | 1.1272          | 38296328          |
| 0.1199        | 0.4187 | 725  | 1.1257          | 38559600          |
| 0.1254        | 0.4216 | 730  | 1.1238          | 38833968          |
| 0.13          | 0.4245 | 735  | 1.1254          | 39100088          |
| 0.0957        | 0.4274 | 740  | 1.1274          | 39366176          |
| 0.1801        | 0.4303 | 745  | 1.1217          | 39636032          |
| 0.0944        | 0.4332 | 750  | 1.1207          | 39903584          |
| 0.1007        | 0.4360 | 755  | 1.1252          | 40163224          |
| 0.1033        | 0.4389 | 760  | 1.1256          | 40428144          |
| 0.1029        | 0.4418 | 765  | 1.1218          | 40697240          |
| 0.0746        | 0.4447 | 770  | 1.1230          | 40962568          |
| 0.1095        | 0.4476 | 775  | 1.1250          | 41228136          |
| 0.1302        | 0.4505 | 780  | 1.1244          | 41496296          |
| 0.1077        | 0.4534 | 785  | 1.1234          | 41760456          |
| 0.1226        | 0.4563 | 790  | 1.1204          | 42022912          |
| 0.1361        | 0.4591 | 795  | 1.1195          | 42291248          |
| 0.1083        | 0.4620 | 800  | 1.1202          | 42562552          |
| 0.1502        | 0.4649 | 805  | 1.1204          | 42833160          |
| 0.1147        | 0.4678 | 810  | 1.1204          | 43098648          |
| 0.1306        | 0.4707 | 815  | 1.1216          | 43360472          |
| 0.114         | 0.4736 | 820  | 1.1220          | 43628632          |
| 0.1           | 0.4765 | 825  | 1.1198          | 43889976          |
| 0.1245        | 0.4794 | 830  | 1.1207          | 44162040          |
| 0.1761        | 0.4822 | 835  | 1.1200          | 44429408          |
| 0.1565        | 0.4851 | 840  | 1.1190          | 44694248          |
| 0.1473        | 0.4880 | 845  | 1.1174          | 44967944          |
| 0.0811        | 0.4909 | 850  | 1.1188          | 45233848          |
| 0.0874        | 0.4938 | 855  | 1.1186          | 45501712          |
| 0.1277        | 0.4967 | 860  | 1.1189          | 45770224          |
| 0.1056        | 0.4996 | 865  | 1.1173          | 46026120          |
| 0.0927        | 0.5025 | 870  | 1.1164          | 46293248          |
| 0.1233        | 0.5053 | 875  | 1.1164          | 46559048          |
| 0.1055        | 0.5082 | 880  | 1.1181          | 46817752          |
| 0.132         | 0.5111 | 885  | 1.1189          | 47088288          |
| 0.108         | 0.5140 | 890  | 1.1168          | 47354792          |
| 0.1097        | 0.5169 | 895  | 1.1170          | 47621808          |
| 0.1805        | 0.5198 | 900  | 1.1151          | 47892168          |
| 0.1229        | 0.5227 | 905  | 1.1151          | 48164096          |
| 0.1484        | 0.5256 | 910  | 1.1181          | 48434336          |
| 0.1245        | 0.5284 | 915  | 1.1175          | 48701160          |
| 0.0801        | 0.5313 | 920  | 1.1155          | 48966920          |
| 0.0684        | 0.5342 | 925  | 1.1150          | 49231088          |
| 0.1012        | 0.5371 | 930  | 1.1172          | 49500872          |
| 0.0826        | 0.5400 | 935  | 1.1169          | 49764080          |
| 0.0547        | 0.5429 | 940  | 1.1156          | 50037824          |
| 0.1756        | 0.5458 | 945  | 1.1166          | 50313056          |
| 0.1313        | 0.5487 | 950  | 1.1165          | 50584128          |
| 0.1571        | 0.5515 | 955  | 1.1141          | 50847832          |
| 0.1404        | 0.5544 | 960  | 1.1148          | 51107992          |
| 0.1436        | 0.5573 | 965  | 1.1144          | 51370408          |
| 0.1767        | 0.5602 | 970  | 1.1130          | 51639728          |
| 0.15          | 0.5631 | 975  | 1.1121          | 51905288          |
| 0.1444        | 0.5660 | 980  | 1.1147          | 52176536          |
| 0.13          | 0.5689 | 985  | 1.1159          | 52449872          |
| 0.1294        | 0.5718 | 990  | 1.1149          | 52714680          |
| 0.1163        | 0.5746 | 995  | 1.1136          | 52980096          |
| 0.0975        | 0.5775 | 1000 | 1.1133          | 53242168          |
| 0.1348        | 0.5804 | 1005 | 1.1144          | 53515240          |
| 0.0872        | 0.5833 | 1010 | 1.1130          | 53776040          |
| 0.0634        | 0.5862 | 1015 | 1.1133          | 54042552          |
| 0.1704        | 0.5891 | 1020 | 1.1138          | 54309432          |
| 0.0965        | 0.5920 | 1025 | 1.1138          | 54576264          |
| 0.1           | 0.5949 | 1030 | 1.1143          | 54840264          |
| 0.1074        | 0.5977 | 1035 | 1.1140          | 55106192          |
| 0.101         | 0.6006 | 1040 | 1.1116          | 55370344          |
| 0.1473        | 0.6035 | 1045 | 1.1112          | 55637160          |
| 0.0814        | 0.6064 | 1050 | 1.1133          | 55906880          |
| 0.1764        | 0.6093 | 1055 | 1.1135          | 56178392          |
| 0.103         | 0.6122 | 1060 | 1.1120          | 56439152          |
| 0.1243        | 0.6151 | 1065 | 1.1120          | 56708416          |
| 0.1113        | 0.6180 | 1070 | 1.1122          | 56975080          |
| 0.1242        | 0.6208 | 1075 | 1.1118          | 57234312          |
| 0.0737        | 0.6237 | 1080 | 1.1114          | 57504112          |
| 0.1164        | 0.6266 | 1085 | 1.1145          | 57764192          |
| 0.1563        | 0.6295 | 1090 | 1.1125          | 58029744          |
| 0.144         | 0.6324 | 1095 | 1.1097          | 58296288          |
| 0.1292        | 0.6353 | 1100 | 1.1100          | 58559256          |
| 0.0958        | 0.6382 | 1105 | 1.1111          | 58816976          |
| 0.1067        | 0.6411 | 1110 | 1.1117          | 59088408          |
| 0.1166        | 0.6440 | 1115 | 1.1126          | 59349992          |
| 0.1541        | 0.6468 | 1120 | 1.1108          | 59608656          |
| 0.0775        | 0.6497 | 1125 | 1.1102          | 59877288          |
| 0.1546        | 0.6526 | 1130 | 1.1120          | 60144648          |
| 0.0741        | 0.6555 | 1135 | 1.1118          | 60414784          |
| 0.1158        | 0.6584 | 1140 | 1.1101          | 60687248          |
| 0.1345        | 0.6613 | 1145 | 1.1108          | 60956640          |
| 0.1763        | 0.6642 | 1150 | 1.1115          | 61222976          |
| 0.0611        | 0.6671 | 1155 | 1.1117          | 61489536          |
| 0.1453        | 0.6699 | 1160 | 1.1115          | 61762704          |
| 0.1826        | 0.6728 | 1165 | 1.1092          | 62028760          |
| 0.0834        | 0.6757 | 1170 | 1.1094          | 62298616          |
| 0.1709        | 0.6786 | 1175 | 1.1107          | 62568072          |
| 0.1787        | 0.6815 | 1180 | 1.1090          | 62832512          |
| 0.1068        | 0.6844 | 1185 | 1.1086          | 63094744          |
| 0.1228        | 0.6873 | 1190 | 1.1074          | 63363208          |
| 0.1137        | 0.6902 | 1195 | 1.1071          | 63643528          |
| 0.0934        | 0.6930 | 1200 | 1.1072          | 63911528          |
| 0.1905        | 0.6959 | 1205 | 1.1072          | 64172360          |
| 0.1285        | 0.6988 | 1210 | 1.1090          | 64439392          |
| 0.1405        | 0.7017 | 1215 | 1.1103          | 64711128          |
| 0.1031        | 0.7046 | 1220 | 1.1102          | 64974024          |
| 0.1651        | 0.7075 | 1225 | 1.1092          | 65234672          |
| 0.1112        | 0.7104 | 1230 | 1.1070          | 65493112          |
| 0.1175        | 0.7133 | 1235 | 1.1075          | 65758712          |
| 0.1216        | 0.7161 | 1240 | 1.1082          | 66024680          |
| 0.0749        | 0.7190 | 1245 | 1.1098          | 66295584          |
| 0.1513        | 0.7219 | 1250 | 1.1079          | 66559520          |
| 0.1151        | 0.7248 | 1255 | 1.1068          | 66834168          |
| 0.181         | 0.7277 | 1260 | 1.1075          | 67097544          |
| 0.1586        | 0.7306 | 1265 | 1.1087          | 67356608          |
| 0.0934        | 0.7335 | 1270 | 1.1081          | 67622152          |
| 0.0991        | 0.7364 | 1275 | 1.1070          | 67885504          |
| 0.1203        | 0.7392 | 1280 | 1.1060          | 68156088          |
| 0.1323        | 0.7421 | 1285 | 1.1049          | 68427920          |
| 0.1043        | 0.7450 | 1290 | 1.1056          | 68688672          |
| 0.1415        | 0.7479 | 1295 | 1.1070          | 68953512          |
| 0.1361        | 0.7508 | 1300 | 1.1058          | 69222744          |
| 0.1713        | 0.7537 | 1305 | 1.1041          | 69493152          |
| 0.1207        | 0.7566 | 1310 | 1.1047          | 69759064          |
| 0.123         | 0.7595 | 1315 | 1.1051          | 70028704          |
| 0.1134        | 0.7623 | 1320 | 1.1061          | 70292648          |
| 0.1002        | 0.7652 | 1325 | 1.1054          | 70559392          |
| 0.1196        | 0.7681 | 1330 | 1.1049          | 70828480          |
| 0.1276        | 0.7710 | 1335 | 1.1047          | 71101208          |
| 0.1287        | 0.7739 | 1340 | 1.1054          | 71367200          |
| 0.109         | 0.7768 | 1345 | 1.1039          | 71634640          |
| 0.1795        | 0.7797 | 1350 | 1.1032          | 71902800          |
| 0.1094        | 0.7826 | 1355 | 1.1032          | 72174800          |
| 0.125         | 0.7854 | 1360 | 1.1053          | 72432704          |
| 0.1531        | 0.7883 | 1365 | 1.1055          | 72700696          |
| 0.122         | 0.7912 | 1370 | 1.1034          | 72965800          |
| 0.0804        | 0.7941 | 1375 | 1.1032          | 73231184          |
| 0.146         | 0.7970 | 1380 | 1.1033          | 73498048          |
| 0.1349        | 0.7999 | 1385 | 1.1025          | 73761088          |
| 0.107         | 0.8028 | 1390 | 1.1037          | 74028624          |
| 0.0812        | 0.8057 | 1395 | 1.1038          | 74291768          |
| 0.1222        | 0.8085 | 1400 | 1.1045          | 74563544          |
| 0.1458        | 0.8114 | 1405 | 1.1054          | 74832776          |
| 0.1657        | 0.8143 | 1410 | 1.1023          | 75101176          |
| 0.1954        | 0.8172 | 1415 | 1.1017          | 75369232          |
| 0.0891        | 0.8201 | 1420 | 1.1022          | 75638216          |
| 0.0955        | 0.8230 | 1425 | 1.1041          | 75900240          |
| 0.1365        | 0.8259 | 1430 | 1.1035          | 76159624          |
| 0.1079        | 0.8288 | 1435 | 1.1004          | 76422720          |
| 0.0682        | 0.8316 | 1440 | 1.1013          | 76686592          |
| 0.0583        | 0.8345 | 1445 | 1.1029          | 76949056          |
| 0.1214        | 0.8374 | 1450 | 1.1024          | 77218632          |
| 0.1268        | 0.8403 | 1455 | 1.1006          | 77478560          |
| 0.1053        | 0.8432 | 1460 | 1.1008          | 77743768          |
| 0.108         | 0.8461 | 1465 | 1.1031          | 78017344          |
| 0.0866        | 0.8490 | 1470 | 1.1021          | 78276936          |
| 0.0885        | 0.8519 | 1475 | 1.1003          | 78544376          |
| 0.0623        | 0.8548 | 1480 | 1.1005          | 78808896          |
| 0.1158        | 0.8576 | 1485 | 1.1015          | 79078776          |
| 0.1327        | 0.8605 | 1490 | 1.1018          | 79345224          |
| 0.0456        | 0.8634 | 1495 | 1.1017          | 79606728          |
| 0.0962        | 0.8663 | 1500 | 1.1019          | 79872616          |
| 0.1048        | 0.8692 | 1505 | 1.1017          | 80139096          |
| 0.0817        | 0.8721 | 1510 | 1.1008          | 80403584          |
| 0.1074        | 0.8750 | 1515 | 1.1015          | 80670528          |
| 0.1072        | 0.8779 | 1520 | 1.1015          | 80938992          |
| 0.1117        | 0.8807 | 1525 | 1.1014          | 81204304          |
| 0.0757        | 0.8836 | 1530 | 1.1020          | 81466168          |
| 0.1819        | 0.8865 | 1535 | 1.1017          | 81736616          |
| 0.1645        | 0.8894 | 1540 | 1.0998          | 82000800          |
| 0.1252        | 0.8923 | 1545 | 1.0981          | 82269024          |
| 0.1398        | 0.8952 | 1550 | 1.0987          | 82540928          |
| 0.1036        | 0.8981 | 1555 | 1.1008          | 82807760          |
| 0.1573        | 0.9010 | 1560 | 1.1002          | 83066216          |
| 0.1581        | 0.9038 | 1565 | 1.0993          | 83339256          |
| 0.0878        | 0.9067 | 1570 | 1.0996          | 83604536          |
| 0.092         | 0.9096 | 1575 | 1.1001          | 83865496          |
| 0.1575        | 0.9125 | 1580 | 1.0991          | 84126424          |
| 0.08          | 0.9154 | 1585 | 1.0989          | 84391176          |
| 0.0513        | 0.9183 | 1590 | 1.1008          | 84660208          |
| 0.1259        | 0.9212 | 1595 | 1.1026          | 84930128          |
| 0.15          | 0.9241 | 1600 | 1.1019          | 85195328          |
| 0.0984        | 0.9269 | 1605 | 1.0990          | 85453816          |
| 0.1439        | 0.9298 | 1610 | 1.0993          | 85723592          |
| 0.1366        | 0.9327 | 1615 | 1.0992          | 85987408          |
| 0.1144        | 0.9356 | 1620 | 1.0992          | 86250040          |
| 0.1167        | 0.9385 | 1625 | 1.0995          | 86516408          |
| 0.1447        | 0.9414 | 1630 | 1.1004          | 86779568          |
| 0.1233        | 0.9443 | 1635 | 1.0990          | 87049088          |
| 0.1037        | 0.9472 | 1640 | 1.0979          | 87317264          |
| 0.1341        | 0.9500 | 1645 | 1.0985          | 87581184          |
| 0.1036        | 0.9529 | 1650 | 1.0992          | 87846968          |
| 0.1435        | 0.9558 | 1655 | 1.0976          | 88100656          |
| 0.1207        | 0.9587 | 1660 | 1.0968          | 88363744          |
| 0.1299        | 0.9616 | 1665 | 1.0978          | 88629744          |
| 0.1279        | 0.9645 | 1670 | 1.0990          | 88894456          |
| 0.1122        | 0.9674 | 1675 | 1.0988          | 89162032          |
| 0.1317        | 0.9703 | 1680 | 1.0972          | 89431088          |
| 0.1591        | 0.9731 | 1685 | 1.0972          | 89703328          |
| 0.1128        | 0.9760 | 1690 | 1.0987          | 89967664          |
| 0.1896        | 0.9789 | 1695 | 1.0985          | 90232816          |
| 0.0941        | 0.9818 | 1700 | 1.0973          | 90500560          |
| 0.1163        | 0.9847 | 1705 | 1.0960          | 90769384          |
| 0.0629        | 0.9876 | 1710 | 1.0973          | 91037864          |
| 0.1257        | 0.9905 | 1715 | 1.0987          | 91299848          |
| 0.0984        | 0.9934 | 1720 | 1.0984          | 91567824          |
| 0.086         | 0.9962 | 1725 | 1.0988          | 91834832          |
| 0.1386        | 0.9991 | 1730 | 1.0985          | 92091840          |
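
The log above can be inspected programmatically. The sketch below parses a handful of rows copied from the table (not the full log) and reports the row with the lowest validation loss and the average number of input tokens per optimizer step. The helper `parse_row` and the dictionary keys are hypothetical names chosen for illustration.

```python
# A few rows copied verbatim from the table above; not the full log.
rows = """
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.2231        | 0.0231 | 40   | 1.1920          | 2134728           |
| 0.4708        | 0.0462 | 80   | 1.3241          | 4269560           |
| 0.1163        | 0.9847 | 1705 | 1.0960          | 90769384          |
| 0.1386        | 0.9991 | 1730 | 1.0985          | 92091840          |
"""


def parse_row(line: str) -> dict:
    """Hypothetical helper: turn one Markdown table row into a record."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return {
        # The first row has no training loss yet ("No log").
        "train_loss": None if cells[0] == "No log" else float(cells[0]),
        "epoch": float(cells[1]),
        "step": int(cells[2]),
        "val_loss": float(cells[3]),
        "tokens": int(cells[4]),
    }


records = [parse_row(line) for line in rows.strip().splitlines()]

# Row with the lowest validation loss among the parsed rows.
best = min(records, key=lambda r: r["val_loss"])

# Average input tokens consumed per optimizer step, from the last row.
last = records[-1]
tokens_per_step = last["tokens"] / last["step"]

print(best["step"], best["val_loss"])
print(round(tokens_per_step))
```

In this excerpt the lowest validation loss is 1.0960 at step 1705, which also appears to be the lowest value logged in the full table. The excerpt also shows validation loss rising from 1.1920 at step 40 to 1.3241 at step 80 while training loss keeps falling, before it recovers later in the run.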


### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1