kanishka committed
Commit 1c2c453
1 Parent(s): c842b77

End of training
README.md CHANGED
@@ -1,11 +1,23 @@
 ---
 tags:
 - generated_from_trainer
+datasets:
+- kanishka/babylm2-subset
 metrics:
 - accuracy
 model-index:
 - name: opt-babylm2-subset-default-3e-4
-  results: []
+  results:
+  - task:
+      name: Causal Language Modeling
+      type: text-generation
+    dataset:
+      name: kanishka/babylm2-subset
+      type: kanishka/babylm2-subset
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.5327396962492645
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,7 +25,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # opt-babylm2-subset-default-3e-4
 
-This model was trained from scratch on an unknown dataset.
+This model was trained from scratch on the kanishka/babylm2-subset dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.3776
 - Accuracy: 0.5327
all_results.json ADDED
@@ -0,0 +1,16 @@
+{
+    "epoch": 10.0,
+    "eval_accuracy": 0.5327396962492645,
+    "eval_loss": 2.377573251724243,
+    "eval_runtime": 124.4661,
+    "eval_samples": 46951,
+    "eval_samples_per_second": 377.219,
+    "eval_steps_per_second": 5.897,
+    "perplexity": 10.77871387443153,
+    "total_flos": 5.9695438005504e+17,
+    "train_loss": 2.237272139724321,
+    "train_runtime": 31036.7572,
+    "train_samples": 453383,
+    "train_samples_per_second": 146.079,
+    "train_steps_per_second": 4.565
+}
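The reported `perplexity` is simply the exponential of the evaluation cross-entropy loss. A quick sanity check of that relationship (not part of the commit; values copied from `all_results.json`):

```python
import math

# Perplexity for a causal language model is exp(cross-entropy loss).
eval_loss = 2.377573251724243
perplexity = math.exp(eval_loss)
print(perplexity)  # ~10.7787, matching the reported value
```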
eval_results.json ADDED
@@ -0,0 +1,10 @@
+{
+    "epoch": 10.0,
+    "eval_accuracy": 0.5327396962492645,
+    "eval_loss": 2.377573251724243,
+    "eval_runtime": 124.4661,
+    "eval_samples": 46951,
+    "eval_samples_per_second": 377.219,
+    "eval_steps_per_second": 5.897,
+    "perplexity": 10.77871387443153
+}
runs/Jul24_22-53-38_phyl-ling-p01.la.utexas.edu/events.out.tfevents.1721911056.phyl-ling-p01.la.utexas.edu.132712.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f902d4566853072bec641ec0a18171a081d044cfdc6327673cbc1ce69376a2a9
+size 417
train_results.json ADDED
@@ -0,0 +1,9 @@
+{
+    "epoch": 10.0,
+    "total_flos": 5.9695438005504e+17,
+    "train_loss": 2.237272139724321,
+    "train_runtime": 31036.7572,
+    "train_samples": 453383,
+    "train_samples_per_second": 146.079,
+    "train_steps_per_second": 4.565
+}
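The throughput fields are derivable from the other numbers in this file. A sanity check, assuming the 10 training epochs recorded in `trainer_state.json`:

```python
# Cross-check train_samples_per_second from train_results.json.
train_samples = 453383     # samples per epoch
train_runtime = 31036.7572 # seconds, total across all epochs
epochs = 10

samples_per_second = train_samples * epochs / train_runtime
print(round(samples_per_second, 3))  # ~146.079, matching the reported value
```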
trainer_state.json ADDED
@@ -0,0 +1,1119 @@
+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 10.0,
+  "eval_steps": 500,
+  "global_step": 141690,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.07057661091114405,
+      "grad_norm": 0.8544681072235107,
+      "learning_rate": 9.375e-06,
+      "loss": 5.8075,
+      "step": 1000
+    },
+    {
+      "epoch": 0.1411532218222881,
+      "grad_norm": 0.9348427057266235,
+      "learning_rate": 1.875e-05,
+      "loss": 3.9507,
+      "step": 2000
+    },
+    {
+      "epoch": 0.21172983273343213,
+      "grad_norm": 0.9987612366676331,
+      "learning_rate": 2.8125e-05,
+      "loss": 3.6154,
+      "step": 3000
+    },
+    {
+      "epoch": 0.2823064436445762,
+      "grad_norm": 0.9734600782394409,
+      "learning_rate": 3.75e-05,
+      "loss": 3.444,
+      "step": 4000
+    },
+    {
+      "epoch": 0.3528830545557202,
+      "grad_norm": 0.9963154196739197,
+      "learning_rate": 4.6874999999999994e-05,
+      "loss": 3.2983,
+      "step": 5000
+    },
+    {
+      "epoch": 0.42345966546686425,
+      "grad_norm": 0.8833392858505249,
+      "learning_rate": 5.625e-05,
+      "loss": 3.1806,
+      "step": 6000
+    },
+    {
+      "epoch": 0.49403627637800834,
+      "grad_norm": 0.8395134806632996,
+      "learning_rate": 6.5625e-05,
+      "loss": 3.0762,
+      "step": 7000
+    },
+    {
+      "epoch": 0.5646128872891524,
+      "grad_norm": 0.9056613445281982,
+      "learning_rate": 7.5e-05,
+      "loss": 3.0103,
+      "step": 8000
+    },
+    {
+      "epoch": 0.6351894982002965,
+      "grad_norm": 0.8697331547737122,
+      "learning_rate": 8.437499999999999e-05,
+      "loss": 2.9146,
+      "step": 9000
+    },
+    {
+      "epoch": 0.7057661091114404,
+      "grad_norm": 0.7743774056434631,
+      "learning_rate": 9.374999999999999e-05,
+      "loss": 2.8472,
+      "step": 10000
+    },
+    {
+      "epoch": 0.7763427200225845,
+      "grad_norm": 0.769959568977356,
+      "learning_rate": 0.00010312499999999999,
+      "loss": 2.8093,
+      "step": 11000
+    },
+    {
+      "epoch": 0.8469193309337285,
+      "grad_norm": 0.7926977872848511,
+      "learning_rate": 0.0001125,
+      "loss": 2.7498,
+      "step": 12000
+    },
+    {
+      "epoch": 0.9174959418448726,
+      "grad_norm": 0.7256486415863037,
+      "learning_rate": 0.000121865625,
+      "loss": 2.714,
+      "step": 13000
+    },
+    {
+      "epoch": 0.9880725527560167,
+      "grad_norm": 0.7287374138832092,
+      "learning_rate": 0.000131240625,
+      "loss": 2.6575,
+      "step": 14000
+    },
+    {
+      "epoch": 1.0,
+      "eval_accuracy": 0.4749594473317534,
+      "eval_loss": 2.8570854663848877,
+      "eval_runtime": 123.5266,
+      "eval_samples_per_second": 380.088,
+      "eval_steps_per_second": 5.942,
+      "step": 14169
+    },
+    {
+      "epoch": 1.0586491636671607,
+      "grad_norm": 0.7397704124450684,
+      "learning_rate": 0.00014060625,
+      "loss": 2.6238,
+      "step": 15000
+    },
+    {
+      "epoch": 1.1292257745783048,
+      "grad_norm": 0.7099004983901978,
+      "learning_rate": 0.000149971875,
+      "loss": 2.5914,
+      "step": 16000
+    },
+    {
+      "epoch": 1.1998023854894488,
+      "grad_norm": 0.6785891056060791,
+      "learning_rate": 0.000159346875,
+      "loss": 2.5765,
+      "step": 17000
+    },
+    {
+      "epoch": 1.2703789964005927,
+      "grad_norm": 0.6459276080131531,
+      "learning_rate": 0.000168703125,
+      "loss": 2.5484,
+      "step": 18000
+    },
+    {
+      "epoch": 1.340955607311737,
+      "grad_norm": 0.6296180486679077,
+      "learning_rate": 0.000178078125,
+      "loss": 2.5329,
+      "step": 19000
+    },
+    {
+      "epoch": 1.4115322182228809,
+      "grad_norm": 0.648757815361023,
+      "learning_rate": 0.00018745312499999998,
+      "loss": 2.5078,
+      "step": 20000
+    },
+    {
+      "epoch": 1.482108829134025,
+      "grad_norm": 0.6126940250396729,
+      "learning_rate": 0.00019681874999999998,
+      "loss": 2.5066,
+      "step": 21000
+    },
+    {
+      "epoch": 1.552685440045169,
+      "grad_norm": 0.5499350428581238,
+      "learning_rate": 0.00020618437499999995,
+      "loss": 2.4882,
+      "step": 22000
+    },
+    {
+      "epoch": 1.623262050956313,
+      "grad_norm": 0.7012745141983032,
+      "learning_rate": 0.00021555937499999998,
+      "loss": 2.4746,
+      "step": 23000
+    },
+    {
+      "epoch": 1.6938386618674572,
+      "grad_norm": 0.563831627368927,
+      "learning_rate": 0.00022493437499999998,
+      "loss": 2.4607,
+      "step": 24000
+    },
+    {
+      "epoch": 1.764415272778601,
+      "grad_norm": 0.4928041696548462,
+      "learning_rate": 0.00023430937499999997,
+      "loss": 2.4454,
+      "step": 25000
+    },
+    {
+      "epoch": 1.8349918836897452,
+      "grad_norm": 0.5389479398727417,
+      "learning_rate": 0.00024367499999999997,
+      "loss": 2.4429,
+      "step": 26000
+    },
+    {
+      "epoch": 1.9055684946008893,
+      "grad_norm": 0.549089252948761,
+      "learning_rate": 0.000253040625,
+      "loss": 2.4239,
+      "step": 27000
+    },
+    {
+      "epoch": 1.9761451055120332,
+      "grad_norm": 0.5027530193328857,
+      "learning_rate": 0.000262415625,
+      "loss": 2.4179,
+      "step": 28000
+    },
+    {
+      "epoch": 2.0,
+      "eval_accuracy": 0.4989820084802377,
+      "eval_loss": 2.6257801055908203,
+      "eval_runtime": 124.6794,
+      "eval_samples_per_second": 376.574,
+      "eval_steps_per_second": 5.887,
+      "step": 28338
+    },
+    {
+      "epoch": 2.0467217164231775,
+      "grad_norm": 0.4511743485927582,
+      "learning_rate": 0.000271790625,
+      "loss": 2.3885,
+      "step": 29000
+    },
+    {
+      "epoch": 2.1172983273343213,
+      "grad_norm": 0.4871021807193756,
+      "learning_rate": 0.00028115624999999994,
+      "loss": 2.378,
+      "step": 30000
+    },
+    {
+      "epoch": 2.1878749382454656,
+      "grad_norm": 0.4743562638759613,
+      "learning_rate": 0.00029053124999999994,
+      "loss": 2.3721,
+      "step": 31000
+    },
+    {
+      "epoch": 2.2584515491566095,
+      "grad_norm": 0.4668987989425659,
+      "learning_rate": 0.00029990624999999993,
+      "loss": 2.3584,
+      "step": 32000
+    },
+    {
+      "epoch": 2.3290281600677534,
+      "grad_norm": 0.4822200834751129,
+      "learning_rate": 0.0002972951043850852,
+      "loss": 2.3592,
+      "step": 33000
+    },
+    {
+      "epoch": 2.3996047709788977,
+      "grad_norm": 0.43820998072624207,
+      "learning_rate": 0.0002945601239857781,
+      "loss": 2.3459,
+      "step": 34000
+    },
+    {
+      "epoch": 2.4701813818900415,
+      "grad_norm": 0.4586232602596283,
+      "learning_rate": 0.0002918251435864709,
+      "loss": 2.3499,
+      "step": 35000
+    },
+    {
+      "epoch": 2.5407579928011854,
+      "grad_norm": 0.4139242172241211,
+      "learning_rate": 0.0002890928981675631,
+      "loss": 2.3302,
+      "step": 36000
+    },
+    {
+      "epoch": 2.6113346037123297,
+      "grad_norm": 0.41075077652931213,
+      "learning_rate": 0.000286357917768256,
+      "loss": 2.3236,
+      "step": 37000
+    },
+    {
+      "epoch": 2.681911214623474,
+      "grad_norm": 0.4315313994884491,
+      "learning_rate": 0.00028362567234934815,
+      "loss": 2.3239,
+      "step": 38000
+    },
+    {
+      "epoch": 2.752487825534618,
+      "grad_norm": 0.39671897888183594,
+      "learning_rate": 0.000280890691950041,
+      "loss": 2.3087,
+      "step": 39000
+    },
+    {
+      "epoch": 2.8230644364457618,
+      "grad_norm": 0.4033380150794983,
+      "learning_rate": 0.00027815844653113317,
+      "loss": 2.2952,
+      "step": 40000
+    },
+    {
+      "epoch": 2.893641047356906,
+      "grad_norm": 0.36479175090789795,
+      "learning_rate": 0.000275423466131826,
+      "loss": 2.3049,
+      "step": 41000
+    },
+    {
+      "epoch": 2.96421765826805,
+      "grad_norm": 0.3533291220664978,
+      "learning_rate": 0.00027268848573251887,
+      "loss": 2.286,
+      "step": 42000
+    },
+    {
+      "epoch": 3.0,
+      "eval_accuracy": 0.5122970740171453,
+      "eval_loss": 2.5088469982147217,
+      "eval_runtime": 124.5088,
+      "eval_samples_per_second": 377.09,
+      "eval_steps_per_second": 5.895,
+      "step": 42507
+    },
+    {
+      "epoch": 3.034794269179194,
+      "grad_norm": 0.37031927704811096,
+      "learning_rate": 0.00026995350533321177,
+      "loss": 2.2529,
+      "step": 43000
+    },
+    {
+      "epoch": 3.105370880090338,
+      "grad_norm": 0.39132222533226013,
+      "learning_rate": 0.0002672185249339046,
+      "loss": 2.2252,
+      "step": 44000
+    },
+    {
+      "epoch": 3.175947491001482,
+      "grad_norm": 0.36771416664123535,
+      "learning_rate": 0.0002644835445345975,
+      "loss": 2.2389,
+      "step": 45000
+    },
+    {
+      "epoch": 3.2465241019126263,
+      "grad_norm": 0.40541785955429077,
+      "learning_rate": 0.00026175129911568964,
+      "loss": 2.2224,
+      "step": 46000
+    },
+    {
+      "epoch": 3.31710071282377,
+      "grad_norm": 0.4325103461742401,
+      "learning_rate": 0.0002590190536967818,
+      "loss": 2.2286,
+      "step": 47000
+    },
+    {
+      "epoch": 3.3876773237349145,
+      "grad_norm": 0.3990887403488159,
+      "learning_rate": 0.00025628407329747466,
+      "loss": 2.216,
+      "step": 48000
+    },
+    {
+      "epoch": 3.4582539346460583,
+      "grad_norm": 0.36805373430252075,
+      "learning_rate": 0.00025354909289816756,
+      "loss": 2.2209,
+      "step": 49000
+    },
+    {
+      "epoch": 3.528830545557202,
+      "grad_norm": 0.357721209526062,
+      "learning_rate": 0.0002508141124988604,
+      "loss": 2.2218,
+      "step": 50000
+    },
+    {
+      "epoch": 3.5994071564683465,
+      "grad_norm": 0.3424566984176636,
+      "learning_rate": 0.00024808460206035186,
+      "loss": 2.2159,
+      "step": 51000
+    },
+    {
+      "epoch": 3.6699837673794904,
+      "grad_norm": 0.35358792543411255,
+      "learning_rate": 0.00024534962166104476,
+      "loss": 2.2085,
+      "step": 52000
+    },
+    {
+      "epoch": 3.7405603782906347,
+      "grad_norm": 0.369150310754776,
+      "learning_rate": 0.0002426146412617376,
+      "loss": 2.2033,
+      "step": 53000
+    },
+    {
+      "epoch": 3.8111369892017786,
+      "grad_norm": 0.3341532051563263,
+      "learning_rate": 0.00023987966086243048,
+      "loss": 2.2017,
+      "step": 54000
+    },
+    {
+      "epoch": 3.8817136001129224,
+      "grad_norm": 0.40089789032936096,
+      "learning_rate": 0.00023714741544352263,
+      "loss": 2.2015,
+      "step": 55000
+    },
+    {
+      "epoch": 3.9522902110240667,
+      "grad_norm": 0.35854268074035645,
+      "learning_rate": 0.0002344124350442155,
+      "loss": 2.2124,
+      "step": 56000
+    },
+    {
+      "epoch": 4.0,
+      "eval_accuracy": 0.5203132962446899,
+      "eval_loss": 2.444835662841797,
+      "eval_runtime": 124.4717,
+      "eval_samples_per_second": 377.202,
+      "eval_steps_per_second": 5.897,
+      "step": 56676
+    },
+    {
+      "epoch": 4.022866821935211,
+      "grad_norm": 0.3867768347263336,
+      "learning_rate": 0.00023167745464490835,
+      "loss": 2.1839,
+      "step": 57000
+    },
+    {
+      "epoch": 4.093443432846355,
+      "grad_norm": 0.3627295196056366,
+      "learning_rate": 0.00022894520922600055,
+      "loss": 2.1384,
+      "step": 58000
+    },
+    {
+      "epoch": 4.164020043757499,
+      "grad_norm": 0.35046979784965515,
+      "learning_rate": 0.00022621022882669337,
+      "loss": 2.1439,
+      "step": 59000
+    },
+    {
+      "epoch": 4.234596654668643,
+      "grad_norm": 0.3317088186740875,
+      "learning_rate": 0.00022347524842738624,
+      "loss": 2.1446,
+      "step": 60000
+    },
+    {
+      "epoch": 4.3051732655797865,
+      "grad_norm": 0.37563446164131165,
+      "learning_rate": 0.00022074300300847845,
+      "loss": 2.1408,
+      "step": 61000
+    },
+    {
+      "epoch": 4.375749876490931,
+      "grad_norm": 0.36360374093055725,
+      "learning_rate": 0.00021800802260917127,
+      "loss": 2.1447,
+      "step": 62000
+    },
+    {
+      "epoch": 4.446326487402075,
+      "grad_norm": 0.35474300384521484,
+      "learning_rate": 0.00021527577719026347,
+      "loss": 2.1478,
+      "step": 63000
+    },
+    {
+      "epoch": 4.516903098313219,
+      "grad_norm": 0.38771218061447144,
+      "learning_rate": 0.00021254079679095632,
+      "loss": 2.136,
+      "step": 64000
+    },
+    {
+      "epoch": 4.587479709224363,
+      "grad_norm": 0.3860458433628082,
+      "learning_rate": 0.0002098085513720485,
+      "loss": 2.1379,
+      "step": 65000
+    },
+    {
+      "epoch": 4.658056320135507,
+      "grad_norm": 0.3763484060764313,
+      "learning_rate": 0.00020707357097274134,
+      "loss": 2.1344,
+      "step": 66000
+    },
+    {
+      "epoch": 4.7286329310466515,
+      "grad_norm": 0.37907466292381287,
+      "learning_rate": 0.0002043413255538335,
+      "loss": 2.1426,
+      "step": 67000
+    },
+    {
+      "epoch": 4.799209541957795,
+      "grad_norm": 0.34690865874290466,
+      "learning_rate": 0.00020160634515452636,
+      "loss": 2.1208,
+      "step": 68000
+    },
+    {
+      "epoch": 4.869786152868939,
+      "grad_norm": 0.36183568835258484,
+      "learning_rate": 0.00019887409973561856,
+      "loss": 2.1242,
+      "step": 69000
+    },
+    {
+      "epoch": 4.940362763780083,
+      "grad_norm": 0.35947293043136597,
+      "learning_rate": 0.00019613911933631138,
+      "loss": 2.1307,
+      "step": 70000
+    },
+    {
+      "epoch": 5.0,
+      "eval_accuracy": 0.5251367702083976,
+      "eval_loss": 2.4099230766296387,
+      "eval_runtime": 124.5048,
+      "eval_samples_per_second": 377.102,
+      "eval_steps_per_second": 5.895,
+      "step": 70845
+    },
+    {
+      "epoch": 5.010939374691227,
+      "grad_norm": 0.3575204312801361,
+      "learning_rate": 0.00019340687391740358,
+      "loss": 2.1265,
+      "step": 71000
+    },
+    {
+      "epoch": 5.081515985602372,
+      "grad_norm": 0.38784071803092957,
+      "learning_rate": 0.00019067189351809646,
+      "loss": 2.064,
+      "step": 72000
+    },
+    {
+      "epoch": 5.152092596513516,
+      "grad_norm": 0.34589263796806335,
+      "learning_rate": 0.0001879396480991886,
+      "loss": 2.0754,
+      "step": 73000
+    },
+    {
+      "epoch": 5.2226692074246595,
+      "grad_norm": 0.3403594195842743,
+      "learning_rate": 0.00018520466769988148,
+      "loss": 2.073,
+      "step": 74000
+    },
+    {
+      "epoch": 5.293245818335803,
+      "grad_norm": 0.38706710934638977,
+      "learning_rate": 0.00018246968730057433,
+      "loss": 2.0812,
+      "step": 75000
+    },
+    {
+      "epoch": 5.363822429246947,
+      "grad_norm": 0.39309191703796387,
+      "learning_rate": 0.0001797347069012672,
+      "loss": 2.0915,
+      "step": 76000
+    },
+    {
+      "epoch": 5.434399040158092,
+      "grad_norm": 0.37432822585105896,
+      "learning_rate": 0.00017700246148235935,
+      "loss": 2.0755,
+      "step": 77000
+    },
+    {
+      "epoch": 5.504975651069236,
+      "grad_norm": 0.3538018465042114,
+      "learning_rate": 0.00017426748108305222,
+      "loss": 2.0772,
+      "step": 78000
+    },
+    {
+      "epoch": 5.57555226198038,
+      "grad_norm": 0.3601301610469818,
+      "learning_rate": 0.00017153523566414437,
+      "loss": 2.085,
+      "step": 79000
+    },
+    {
+      "epoch": 5.6461288728915235,
+      "grad_norm": 0.3469400703907013,
+      "learning_rate": 0.00016880299024523657,
+      "loss": 2.078,
+      "step": 80000
+    },
+    {
+      "epoch": 5.716705483802667,
+      "grad_norm": 0.3623177111148834,
+      "learning_rate": 0.00016607074482632872,
+      "loss": 2.077,
+      "step": 81000
+    },
+    {
+      "epoch": 5.787282094713812,
+      "grad_norm": 0.3382331132888794,
+      "learning_rate": 0.0001633357644270216,
+      "loss": 2.0745,
+      "step": 82000
+    },
+    {
+      "epoch": 5.857858705624956,
+      "grad_norm": 0.3787217140197754,
+      "learning_rate": 0.00016060078402771447,
+      "loss": 2.0768,
+      "step": 83000
+    },
+    {
+      "epoch": 5.9284353165361,
+      "grad_norm": 0.36773914098739624,
+      "learning_rate": 0.00015786580362840732,
+      "loss": 2.0756,
+      "step": 84000
+    },
+    {
+      "epoch": 5.999011927447244,
+      "grad_norm": 0.39036858081817627,
+      "learning_rate": 0.00015513082322910016,
+      "loss": 2.0706,
+      "step": 85000
+    },
+    {
+      "epoch": 6.0,
+      "eval_accuracy": 0.5280841264512295,
+      "eval_loss": 2.388700246810913,
+      "eval_runtime": 125.3022,
+      "eval_samples_per_second": 374.702,
+      "eval_steps_per_second": 5.858,
+      "step": 85014
+    },
+    {
+      "epoch": 6.069588538358388,
+      "grad_norm": 0.3616304099559784,
+      "learning_rate": 0.00015239584282979304,
+      "loss": 2.0123,
+      "step": 86000
+    },
+    {
+      "epoch": 6.140165149269532,
+      "grad_norm": 0.37002402544021606,
+      "learning_rate": 0.0001496635974108852,
+      "loss": 2.0153,
+      "step": 87000
+    },
+    {
+      "epoch": 6.210741760180676,
+      "grad_norm": 0.3603184223175049,
+      "learning_rate": 0.0001469286170115781,
+      "loss": 2.0199,
+      "step": 88000
+    },
+    {
+      "epoch": 6.28131837109182,
+      "grad_norm": 0.37550196051597595,
+      "learning_rate": 0.00014419637159267024,
+      "loss": 2.0138,
+      "step": 89000
+    },
+    {
+      "epoch": 6.351894982002964,
+      "grad_norm": 0.3768686056137085,
+      "learning_rate": 0.0001414641261737624,
+      "loss": 2.0314,
+      "step": 90000
+    },
+    {
+      "epoch": 6.422471592914108,
+      "grad_norm": 0.3591078221797943,
+      "learning_rate": 0.00013872914577445526,
+      "loss": 2.0226,
+      "step": 91000
+    },
+    {
+      "epoch": 6.493048203825253,
+      "grad_norm": 0.3905663788318634,
+      "learning_rate": 0.00013599416537514813,
+      "loss": 2.0257,
+      "step": 92000
+    },
+    {
+      "epoch": 6.5636248147363965,
+      "grad_norm": 0.39147230982780457,
+      "learning_rate": 0.00013325918497584098,
+      "loss": 2.0311,
+      "step": 93000
+    },
+    {
+      "epoch": 6.63420142564754,
+      "grad_norm": 0.40250155329704285,
+      "learning_rate": 0.00013052420457653385,
+      "loss": 2.0301,
+      "step": 94000
+    },
+    {
+      "epoch": 6.704778036558684,
+      "grad_norm": 0.3860897123813629,
+      "learning_rate": 0.00012779195915762603,
+      "loss": 2.0312,
+      "step": 95000
+    },
+    {
+      "epoch": 6.775354647469829,
+      "grad_norm": 0.3707718253135681,
+      "learning_rate": 0.0001250597137387182,
+      "loss": 2.0282,
+      "step": 96000
+    },
+    {
+      "epoch": 6.845931258380973,
+      "grad_norm": 0.3690090775489807,
+      "learning_rate": 0.00012232473333941105,
+      "loss": 2.0255,
+      "step": 97000
+    },
+    {
+      "epoch": 6.916507869292117,
+      "grad_norm": 0.4174107313156128,
+      "learning_rate": 0.00011958975294010391,
+      "loss": 2.0276,
+      "step": 98000
+    },
+    {
+      "epoch": 6.987084480203261,
+      "grad_norm": 0.3740842938423157,
+      "learning_rate": 0.00011685750752119609,
+      "loss": 2.0233,
+      "step": 99000
+    },
+    {
+      "epoch": 7.0,
+      "eval_accuracy": 0.5303788443403243,
+      "eval_loss": 2.3779311180114746,
+      "eval_runtime": 124.6993,
+      "eval_samples_per_second": 376.514,
+      "eval_steps_per_second": 5.886,
+      "step": 99183
+    },
+    {
+      "epoch": 7.057661091114404,
+      "grad_norm": 0.37689074873924255,
+      "learning_rate": 0.00011412252712188895,
+      "loss": 1.9805,
+      "step": 100000
+    },
+    {
+      "epoch": 7.128237702025549,
+      "grad_norm": 0.4047756493091583,
+      "learning_rate": 0.00011139028170298112,
+      "loss": 1.9746,
+      "step": 101000
+    },
+    {
+      "epoch": 7.198814312936693,
+      "grad_norm": 0.3889460563659668,
+      "learning_rate": 0.00010865530130367397,
+      "loss": 1.9659,
+      "step": 102000
+    },
+    {
+      "epoch": 7.269390923847837,
+      "grad_norm": 0.4061487019062042,
+      "learning_rate": 0.00010592032090436684,
+      "loss": 1.9708,
+      "step": 103000
+    },
+    {
+      "epoch": 7.339967534758981,
+      "grad_norm": 0.4057160019874573,
+      "learning_rate": 0.0001031853405050597,
+      "loss": 1.9772,
+      "step": 104000
+    },
+    {
+      "epoch": 7.410544145670125,
+      "grad_norm": 0.40489134192466736,
+      "learning_rate": 0.00010045309508615188,
+      "loss": 1.9736,
+      "step": 105000
+    },
+    {
+      "epoch": 7.481120756581269,
+      "grad_norm": 0.4148654043674469,
+      "learning_rate": 9.771811468684473e-05,
+      "loss": 1.9796,
+      "step": 106000
+    },
+    {
+      "epoch": 7.551697367492413,
+      "grad_norm": 0.41483381390571594,
+      "learning_rate": 9.49831342875376e-05,
+      "loss": 1.9804,
+      "step": 107000
+    },
+    {
+      "epoch": 7.622273978403557,
+      "grad_norm": 0.43718773126602173,
+      "learning_rate": 9.225088886862978e-05,
+      "loss": 1.9712,
+      "step": 108000
+    },
+    {
+      "epoch": 7.692850589314701,
+      "grad_norm": 0.40646329522132874,
+      "learning_rate": 8.951590846932262e-05,
+      "loss": 1.9822,
+      "step": 109000
+    },
+    {
+      "epoch": 7.763427200225845,
+      "grad_norm": 0.44571158289909363,
+      "learning_rate": 8.67836630504148e-05,
+      "loss": 1.9832,
+      "step": 110000
+    },
+    {
+      "epoch": 7.83400381113699,
+      "grad_norm": 0.41726765036582947,
+      "learning_rate": 8.404868265110766e-05,
+      "loss": 1.9747,
+      "step": 111000
+    },
+    {
+      "epoch": 7.9045804220481335,
+      "grad_norm": 0.39210569858551025,
+      "learning_rate": 8.131370225180053e-05,
+      "loss": 1.9929,
+      "step": 112000
+    },
+    {
+      "epoch": 7.975157032959277,
+      "grad_norm": 0.37121346592903137,
+      "learning_rate": 7.858145683289268e-05,
+      "loss": 1.9727,
+      "step": 113000
+    },
+    {
+      "epoch": 8.0,
+      "eval_accuracy": 0.5315104156523147,
+      "eval_loss": 2.3731467723846436,
+      "eval_runtime": 124.474,
+      "eval_samples_per_second": 377.195,
+      "eval_steps_per_second": 5.897,
+      "step": 113352
+    },
+    {
+      "epoch": 8.045733643870422,
+      "grad_norm": 0.3864983916282654,
+      "learning_rate": 7.584647643358555e-05,
+      "loss": 1.9365,
+      "step": 114000
+    },
+    {
+      "epoch": 8.116310254781565,
+      "grad_norm": 0.4133272171020508,
+      "learning_rate": 7.311423101467773e-05,
+      "loss": 1.9174,
+      "step": 115000
+    },
+    {
+      "epoch": 8.18688686569271,
+      "grad_norm": 0.4471355676651001,
+      "learning_rate": 7.037925061537059e-05,
+      "loss": 1.9327,
+      "step": 116000
+    },
+    {
+      "epoch": 8.257463476603853,
+      "grad_norm": 0.42897623777389526,
+      "learning_rate": 6.764700519646275e-05,
+      "loss": 1.9288,
+      "step": 117000
+    },
+    {
+      "epoch": 8.328040087514998,
+      "grad_norm": 0.43864506483078003,
+      "learning_rate": 6.491202479715561e-05,
+      "loss": 1.9359,
+      "step": 118000
+    },
+    {
+      "epoch": 8.398616698426142,
+      "grad_norm": 0.46767184138298035,
+      "learning_rate": 6.217977937824779e-05,
+      "loss": 1.9399,
+      "step": 119000
+    },
+    {
+      "epoch": 8.469193309337285,
+      "grad_norm": 0.4159405827522278,
+      "learning_rate": 5.944479897894064e-05,
+      "loss": 1.9318,
+      "step": 120000
+    },
+    {
+      "epoch": 8.53976992024843,
+      "grad_norm": 0.4233142137527466,
+      "learning_rate": 5.671255356003281e-05,
+      "loss": 1.9316,
+      "step": 121000
+    },
+    {
+      "epoch": 8.610346531159573,
+      "grad_norm": 0.4398587942123413,
+      "learning_rate": 5.3980308141124983e-05,
+      "loss": 1.9349,
+      "step": 122000
+    },
+    {
+      "epoch": 8.680923142070718,
+      "grad_norm": 0.424790620803833,
+      "learning_rate": 5.124532774181785e-05,
+      "loss": 1.9382,
+      "step": 123000
+    },
+    {
+      "epoch": 8.751499752981863,
+      "grad_norm": 0.4141928553581238,
+      "learning_rate": 4.851034734251071e-05,
+      "loss": 1.9314,
+      "step": 124000
+    },
+    {
+      "epoch": 8.822076363893006,
+      "grad_norm": 0.45448678731918335,
+      "learning_rate": 4.577536694320357e-05,
+      "loss": 1.9374,
+      "step": 125000
+    },
+    {
+      "epoch": 8.89265297480415,
+      "grad_norm": 0.4196777939796448,
+      "learning_rate": 4.304312152429574e-05,
+      "loss": 1.9413,
+      "step": 126000
+    },
+    {
+      "epoch": 8.963229585715293,
+      "grad_norm": 0.3975803256034851,
+      "learning_rate": 4.03081411249886e-05,
+      "loss": 1.9311,
+      "step": 127000
+    },
+    {
+      "epoch": 9.0,
+      "eval_accuracy": 0.532392202583586,
+      "eval_loss": 2.3728187084198,
+      "eval_runtime": 124.6433,
+      "eval_samples_per_second": 376.683,
+      "eval_steps_per_second": 5.889,
+      "step": 127521
+    },
+    {
+      "epoch": 9.033806196626438,
+      "grad_norm": 0.42701438069343567,
+      "learning_rate": 3.757589570608077e-05,
+      "loss": 1.9097,
+      "step": 128000
+    },
+    {
+      "epoch": 9.104382807537583,
+      "grad_norm": 0.44095513224601746,
+      "learning_rate": 3.484091530677363e-05,
+      "loss": 1.8841,
+      "step": 129000
+    },
+    {
+      "epoch": 9.174959418448726,
+      "grad_norm": 0.4554646909236908,
+      "learning_rate": 3.2108669887865805e-05,
+      "loss": 1.898,
+      "step": 130000
+    },
+    {
+      "epoch": 9.24553602935987,
+      "grad_norm": 0.4306824803352356,
+      "learning_rate": 2.937642446895797e-05,
+      "loss": 1.8921,
+      "step": 131000
+    },
+    {
+      "epoch": 9.316112640271013,
+      "grad_norm": 0.4496276080608368,
+      "learning_rate": 2.664144406965083e-05,
+      "loss": 1.8883,
+      "step": 132000
+    },
+    {
+      "epoch": 9.386689251182158,
+      "grad_norm": 0.4314197301864624,
+      "learning_rate": 2.3906463670343695e-05,
+      "loss": 1.8972,
+      "step": 133000
+    },
+    {
+      "epoch": 9.457265862093303,
+      "grad_norm": 0.4485911428928375,
+      "learning_rate": 2.1171483271036556e-05,
+      "loss": 1.8908,
+      "step": 134000
+    },
+    {
+      "epoch": 9.527842473004446,
+      "grad_norm": 0.4166420102119446,
+      "learning_rate": 1.8439237852128724e-05,
+      "loss": 1.8924,
+      "step": 135000
+    },
+    {
+      "epoch": 9.59841908391559,
+      "grad_norm": 0.4447433650493622,
+      "learning_rate": 1.5704257452821588e-05,
+      "loss": 1.892,
+      "step": 136000
+    },
+    {
+      "epoch": 9.668995694826734,
+      "grad_norm": 0.42986035346984863,
+      "learning_rate": 1.2972012033913756e-05,
+      "loss": 1.8992,
+      "step": 137000
+    },
+    {
+      "epoch": 9.739572305737878,
+      "grad_norm": 0.45574498176574707,
+      "learning_rate": 1.0237031634606618e-05,
+      "loss": 1.8972,
+      "step": 138000
+    },
+    {
+      "epoch": 9.810148916649023,
+      "grad_norm": 0.4515238404273987,
+      "learning_rate": 7.5020512352994795e-06,
+      "loss": 1.8952,
+      "step": 139000
+    },
+    {
+      "epoch": 9.880725527560166,
+      "grad_norm": 0.43244990706443787,
+      "learning_rate": 4.769805816391649e-06,
+      "loss": 1.8946,
+      "step": 140000
+    },
+    {
+      "epoch": 9.951302138471311,
+      "grad_norm": 0.44382914900779724,
+      "learning_rate": 2.0348254170845108e-06,
+      "loss": 1.8943,
+      "step": 141000
+    },
+    {
+      "epoch": 10.0,
+      "eval_accuracy": 0.5327396962492645,
+      "eval_loss": 2.377573251724243,
+      "eval_runtime": 124.448,
+      "eval_samples_per_second": 377.274,
+      "eval_steps_per_second": 5.898,
+      "step": 141690
+    },
+    {
+      "epoch": 10.0,
+      "step": 141690,
+      "total_flos": 5.9695438005504e+17,
+      "train_loss": 2.237272139724321,
+      "train_runtime": 31036.7572,
+      "train_samples_per_second": 146.079,
+      "train_steps_per_second": 4.565
+    }
+  ],
+  "logging_steps": 1000,
+  "max_steps": 141690,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 10,
+  "save_steps": 5000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": true
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 5.9695438005504e+17,
+  "train_batch_size": 32,
+  "trial_name": null,
+  "trial_params": null
+}
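The step counts in this state file tie together arithmetically: with the recorded `train_batch_size` of 32 and 453383 training samples, one epoch is ceil(453383 / 32) = 14169 optimizer steps (the step logged at epoch 1.0), and 10 epochs give `max_steps` = 141690. A quick check, assuming a single process with no gradient accumulation:

```python
import math

# Step bookkeeping from trainer_state.json (single process, batch size 32,
# no gradient accumulation assumed).
train_samples = 453383
batch_size = 32
steps_per_epoch = math.ceil(train_samples / batch_size)
print(steps_per_epoch)       # 14169
print(steps_per_epoch * 10)  # 141690, the recorded global_step / max_steps
```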