Commit ddfd34c, committed by DewiBrynJones
1 Parent(s): 39b2f20

End of training

README.md CHANGED
@@ -2,6 +2,8 @@
  license: apache-2.0
  base_model: facebook/wav2vec2-large-xlsr-53
  tags:
+ - automatic-speech-recognition
+ - DewiBrynJones/banc-trawsgrifiadau-bangor-clean-with-ccv
  - generated_from_trainer
  metrics:
  - wer
@@ -15,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # wav2vec2-xlsr-53-ft-btb-ccv-enc-cy
 
- This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on an unknown dataset.
+ This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the DEWIBRYNJONES/BANC-TRAWSGRIFIADAU-BANGOR-CLEAN-WITH-CCV - DEFAULT dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.4095
  - Wer: 0.3271
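The `Wer` figure in the model card is a word error rate. As a rough illustration of what that number measures, here is a minimal sketch of the conventional definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` and the example strings are hypothetical, and the metric implementation actually used for this training run is not shown in the commit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition an empty transcription scores 1.0, and a hypothesis with many extra words can score above 1.0. Corpus-level tools usually divide total errors by total reference words rather than averaging per-utterance scores, so this per-pair sketch would not reproduce 0.3271 on its own.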
all_results.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "epoch": 1.9357336430507162,
+   "eval_loss": 0.409473717212677,
+   "eval_runtime": 147.313,
+   "eval_samples": 5656,
+   "eval_samples_per_second": 38.394,
+   "eval_steps_per_second": 4.799,
+   "eval_wer": 0.3270690568278474,
+   "total_flos": 1.1255918428180738e+19,
+   "train_loss": 0.7365739318847656,
+   "train_runtime": 19473.5524,
+   "train_samples": 41326,
+   "train_samples_per_second": 4.108,
+   "train_steps_per_second": 0.514
+ }
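The throughput and epoch figures in `all_results.json` can be cross-checked against each other. The sketch below recomputes them from the raw counts and runtimes, together with the `global_step` (10000) and `train_batch_size` (8) values recorded in `trainer_state.json` in this same commit. It assumes a single device and no gradient accumulation; that assumption is not stated in the commit, it is simply what makes the arithmetic match.

```python
import math

# Values copied from all_results.json and trainer_state.json in this commit.
train_samples = 41326
train_runtime = 19473.5524  # seconds
eval_samples = 5656
eval_runtime = 147.313      # seconds
global_step = 10000
train_batch_size = 8

# Assumption: one device and no gradient accumulation, so each optimizer
# step consumes exactly train_batch_size samples.
steps_per_epoch = math.ceil(train_samples / train_batch_size)
epoch = global_step / steps_per_epoch
train_steps_per_second = global_step / train_runtime
train_samples_per_second = global_step * train_batch_size / train_runtime
eval_samples_per_second = eval_samples / eval_runtime

print("steps_per_epoch:", steps_per_epoch)
print("epoch:", epoch)
print("train_steps_per_second:", round(train_steps_per_second, 3))
print("train_samples_per_second:", round(train_samples_per_second, 3))
print("eval_samples_per_second:", round(eval_samples_per_second, 3))
```

Under these assumptions the recomputed values agree with the reported ones to the precision shown in the file, which suggests the run stopped at `max_steps` a little short of two full passes over the 41326 training samples.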
eval_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "epoch": 1.9357336430507162,
+   "eval_loss": 0.409473717212677,
+   "eval_runtime": 147.313,
+   "eval_samples": 5656,
+   "eval_samples_per_second": 38.394,
+   "eval_steps_per_second": 4.799,
+   "eval_wer": 0.3270690568278474
+ }
runs/Jun26_08-42-00_b5ff05b1c2c9/events.out.tfevents.1719407922.b5ff05b1c2c9.31.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7dc754c2fbdec3308e9a76e572e132d646e204d217a81300d72cb7aba427f5b1
+ size 406
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "epoch": 1.9357336430507162,
+   "total_flos": 1.1255918428180738e+19,
+   "train_loss": 0.7365739318847656,
+   "train_runtime": 19473.5524,
+   "train_samples": 41326,
+   "train_samples_per_second": 4.108,
+   "train_steps_per_second": 0.514
+ }
trainer_state.json ADDED
@@ -0,0 +1,1082 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 1.9357336430507162,
5
+ "eval_steps": 100,
6
+ "global_step": 10000,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.019357336430507164,
13
+ "eval_loss": 3.547485828399658,
14
+ "eval_runtime": 144.8623,
15
+ "eval_samples_per_second": 39.044,
16
+ "eval_steps_per_second": 4.88,
17
+ "eval_wer": 1.0,
18
+ "step": 100
19
+ },
20
+ {
21
+ "epoch": 0.03871467286101433,
22
+ "eval_loss": 3.0259251594543457,
23
+ "eval_runtime": 142.9174,
24
+ "eval_samples_per_second": 39.575,
25
+ "eval_steps_per_second": 4.947,
26
+ "eval_wer": 1.0,
27
+ "step": 200
28
+ },
29
+ {
30
+ "epoch": 0.05807200929152149,
31
+ "eval_loss": 3.0886833667755127,
32
+ "eval_runtime": 141.7177,
33
+ "eval_samples_per_second": 39.91,
34
+ "eval_steps_per_second": 4.989,
35
+ "eval_wer": 1.0,
36
+ "step": 300
37
+ },
38
+ {
39
+ "epoch": 0.07742934572202866,
40
+ "eval_loss": 2.3821566104888916,
41
+ "eval_runtime": 143.4279,
42
+ "eval_samples_per_second": 39.434,
43
+ "eval_steps_per_second": 4.929,
44
+ "eval_wer": 0.9971915071175234,
45
+ "step": 400
46
+ },
47
+ {
48
+ "epoch": 0.09678668215253582,
49
+ "grad_norm": 3.033390760421753,
50
+ "learning_rate": 0.0002982,
51
+ "loss": 4.0938,
52
+ "step": 500
53
+ },
54
+ {
55
+ "epoch": 0.09678668215253582,
56
+ "eval_loss": 1.4546788930892944,
57
+ "eval_runtime": 142.7727,
58
+ "eval_samples_per_second": 39.615,
59
+ "eval_steps_per_second": 4.952,
60
+ "eval_wer": 0.9020076711977019,
61
+ "step": 500
62
+ },
63
+ {
64
+ "epoch": 0.11614401858304298,
65
+ "eval_loss": 1.2602813243865967,
66
+ "eval_runtime": 143.0991,
67
+ "eval_samples_per_second": 39.525,
68
+ "eval_steps_per_second": 4.941,
69
+ "eval_wer": 0.8509733433904126,
70
+ "step": 600
71
+ },
72
+ {
73
+ "epoch": 0.13550135501355012,
74
+ "eval_loss": 1.0939536094665527,
75
+ "eval_runtime": 145.4158,
76
+ "eval_samples_per_second": 38.895,
77
+ "eval_steps_per_second": 4.862,
78
+ "eval_wer": 0.7654667715170677,
79
+ "step": 700
80
+ },
81
+ {
82
+ "epoch": 0.1548586914440573,
83
+ "eval_loss": 1.0704576969146729,
84
+ "eval_runtime": 148.1277,
85
+ "eval_samples_per_second": 38.183,
86
+ "eval_steps_per_second": 4.773,
87
+ "eval_wer": 0.7601547078364975,
88
+ "step": 800
89
+ },
90
+ {
91
+ "epoch": 0.17421602787456447,
92
+ "eval_loss": 0.9356458187103271,
93
+ "eval_runtime": 143.6296,
94
+ "eval_samples_per_second": 39.379,
95
+ "eval_steps_per_second": 4.922,
96
+ "eval_wer": 0.6972926128612925,
97
+ "step": 900
98
+ },
99
+ {
100
+ "epoch": 0.19357336430507163,
101
+ "grad_norm": 3.2989861965179443,
102
+ "learning_rate": 0.0002843684210526315,
103
+ "loss": 1.0597,
104
+ "step": 1000
105
+ },
106
+ {
107
+ "epoch": 0.19357336430507163,
108
+ "eval_loss": 0.9103516936302185,
109
+ "eval_runtime": 146.7237,
110
+ "eval_samples_per_second": 38.549,
111
+ "eval_steps_per_second": 4.819,
112
+ "eval_wer": 0.6765579111232367,
113
+ "step": 1000
114
+ },
115
+ {
116
+ "epoch": 0.2129307007355788,
117
+ "eval_loss": 0.8879104256629944,
118
+ "eval_runtime": 153.6385,
119
+ "eval_samples_per_second": 36.814,
120
+ "eval_steps_per_second": 4.602,
121
+ "eval_wer": 0.6569947521304424,
122
+ "step": 1100
123
+ },
124
+ {
125
+ "epoch": 0.23228803716608595,
126
+ "eval_loss": 0.8594633936882019,
127
+ "eval_runtime": 147.9212,
128
+ "eval_samples_per_second": 38.237,
129
+ "eval_steps_per_second": 4.78,
130
+ "eval_wer": 0.6611834186580219,
131
+ "step": 1200
132
+ },
133
+ {
134
+ "epoch": 0.2516453735965931,
135
+ "eval_loss": 0.8351845145225525,
136
+ "eval_runtime": 148.1861,
137
+ "eval_samples_per_second": 38.168,
138
+ "eval_steps_per_second": 4.771,
139
+ "eval_wer": 0.6075331803373402,
140
+ "step": 1300
141
+ },
142
+ {
143
+ "epoch": 0.27100271002710025,
144
+ "eval_loss": 0.791232168674469,
145
+ "eval_runtime": 148.9749,
146
+ "eval_samples_per_second": 37.966,
147
+ "eval_steps_per_second": 4.746,
148
+ "eval_wer": 0.6033124167482466,
149
+ "step": 1400
150
+ },
151
+ {
152
+ "epoch": 0.29036004645760743,
153
+ "grad_norm": 6.474522113800049,
154
+ "learning_rate": 0.000268578947368421,
155
+ "loss": 0.8484,
156
+ "step": 1500
157
+ },
158
+ {
159
+ "epoch": 0.29036004645760743,
160
+ "eval_loss": 0.7862286567687988,
161
+ "eval_runtime": 146.7521,
162
+ "eval_samples_per_second": 38.541,
163
+ "eval_steps_per_second": 4.818,
164
+ "eval_wer": 0.6067468023302467,
165
+ "step": 1500
166
+ },
167
+ {
168
+ "epoch": 0.3097173828881146,
169
+ "eval_loss": 0.7790109515190125,
170
+ "eval_runtime": 147.7062,
171
+ "eval_samples_per_second": 38.292,
172
+ "eval_steps_per_second": 4.787,
173
+ "eval_wer": 0.6009051371346953,
174
+ "step": 1600
175
+ },
176
+ {
177
+ "epoch": 0.32907471931862176,
178
+ "eval_loss": 0.7678210735321045,
179
+ "eval_runtime": 148.3951,
180
+ "eval_samples_per_second": 38.114,
181
+ "eval_steps_per_second": 4.764,
182
+ "eval_wer": 0.5629182648328546,
183
+ "step": 1700
184
+ },
185
+ {
186
+ "epoch": 0.34843205574912894,
187
+ "eval_loss": 0.7514644861221313,
188
+ "eval_runtime": 149.2674,
189
+ "eval_samples_per_second": 37.892,
190
+ "eval_steps_per_second": 4.736,
191
+ "eval_wer": 0.5798655133122562,
192
+ "step": 1800
193
+ },
194
+ {
195
+ "epoch": 0.3677893921796361,
196
+ "eval_loss": 0.7423551678657532,
197
+ "eval_runtime": 149.3427,
198
+ "eval_samples_per_second": 37.873,
199
+ "eval_steps_per_second": 4.734,
200
+ "eval_wer": 0.5859158094076488,
201
+ "step": 1900
202
+ },
203
+ {
204
+ "epoch": 0.38714672861014326,
205
+ "grad_norm": 2.573913335800171,
206
+ "learning_rate": 0.0002527894736842105,
207
+ "loss": 0.764,
208
+ "step": 2000
209
+ },
210
+ {
211
+ "epoch": 0.38714672861014326,
212
+ "eval_loss": 0.7129915356636047,
213
+ "eval_runtime": 148.3711,
214
+ "eval_samples_per_second": 38.121,
215
+ "eval_steps_per_second": 4.765,
216
+ "eval_wer": 0.5520855065718734,
217
+ "step": 2000
218
+ },
219
+ {
220
+ "epoch": 0.4065040650406504,
221
+ "eval_loss": 0.7114368677139282,
222
+ "eval_runtime": 148.2007,
223
+ "eval_samples_per_second": 38.164,
224
+ "eval_steps_per_second": 4.771,
225
+ "eval_wer": 0.5407712923881819,
226
+ "step": 2100
227
+ },
228
+ {
229
+ "epoch": 0.4258614014711576,
230
+ "eval_loss": 0.7228682637214661,
231
+ "eval_runtime": 149.1432,
232
+ "eval_samples_per_second": 37.923,
233
+ "eval_steps_per_second": 4.74,
234
+ "eval_wer": 0.5577024923368266,
235
+ "step": 2200
236
+ },
237
+ {
238
+ "epoch": 0.4452187379016647,
239
+ "eval_loss": 0.6773180961608887,
240
+ "eval_runtime": 154.221,
241
+ "eval_samples_per_second": 36.675,
242
+ "eval_steps_per_second": 4.584,
243
+ "eval_wer": 0.5160084094301167,
244
+ "step": 2300
245
+ },
246
+ {
247
+ "epoch": 0.4645760743321719,
248
+ "eval_loss": 0.6784498691558838,
249
+ "eval_runtime": 149.2744,
250
+ "eval_samples_per_second": 37.89,
251
+ "eval_steps_per_second": 4.736,
252
+ "eval_wer": 0.5177897963441447,
253
+ "step": 2400
254
+ },
255
+ {
256
+ "epoch": 0.48393341076267904,
257
+ "grad_norm": 3.0869553089141846,
258
+ "learning_rate": 0.000237,
259
+ "loss": 0.6868,
260
+ "step": 2500
261
+ },
262
+ {
263
+ "epoch": 0.48393341076267904,
264
+ "eval_loss": 0.672030508518219,
265
+ "eval_runtime": 149.4453,
266
+ "eval_samples_per_second": 37.847,
267
+ "eval_steps_per_second": 4.731,
268
+ "eval_wer": 0.5261831779300605,
269
+ "step": 2500
270
+ },
271
+ {
272
+ "epoch": 0.5032907471931862,
273
+ "eval_loss": 0.6804332137107849,
274
+ "eval_runtime": 151.0327,
275
+ "eval_samples_per_second": 37.449,
276
+ "eval_steps_per_second": 4.681,
277
+ "eval_wer": 0.5336617932628268,
278
+ "step": 2600
279
+ },
280
+ {
281
+ "epoch": 0.5226480836236934,
282
+ "eval_loss": 0.6598911285400391,
283
+ "eval_runtime": 149.0299,
284
+ "eval_samples_per_second": 37.952,
285
+ "eval_steps_per_second": 4.744,
286
+ "eval_wer": 0.5023832068174159,
287
+ "step": 2700
288
+ },
289
+ {
290
+ "epoch": 0.5420054200542005,
291
+ "eval_loss": 0.6287100911140442,
292
+ "eval_runtime": 149.9845,
293
+ "eval_samples_per_second": 37.711,
294
+ "eval_steps_per_second": 4.714,
295
+ "eval_wer": 0.4902023719728459,
296
+ "step": 2800
297
+ },
298
+ {
299
+ "epoch": 0.5613627564847077,
300
+ "eval_loss": 0.6304338574409485,
301
+ "eval_runtime": 150.016,
302
+ "eval_samples_per_second": 37.703,
303
+ "eval_steps_per_second": 4.713,
304
+ "eval_wer": 0.49471200911556545,
305
+ "step": 2900
306
+ },
307
+ {
308
+ "epoch": 0.5807200929152149,
309
+ "grad_norm": 5.678714275360107,
310
+ "learning_rate": 0.00022121052631578946,
311
+ "loss": 0.6761,
312
+ "step": 3000
313
+ },
314
+ {
315
+ "epoch": 0.5807200929152149,
316
+ "eval_loss": 0.6258472204208374,
317
+ "eval_runtime": 149.8088,
318
+ "eval_samples_per_second": 37.755,
319
+ "eval_steps_per_second": 4.719,
320
+ "eval_wer": 0.48513103625363097,
321
+ "step": 3000
322
+ },
323
+ {
324
+ "epoch": 0.6000774293457221,
325
+ "eval_loss": 0.6310975551605225,
326
+ "eval_runtime": 148.9286,
327
+ "eval_samples_per_second": 37.978,
328
+ "eval_steps_per_second": 4.747,
329
+ "eval_wer": 0.4989809182969299,
330
+ "step": 3100
331
+ },
332
+ {
333
+ "epoch": 0.6194347657762292,
334
+ "eval_loss": 0.6171565651893616,
335
+ "eval_runtime": 148.6924,
336
+ "eval_samples_per_second": 38.038,
337
+ "eval_steps_per_second": 4.755,
338
+ "eval_wer": 0.4901060807883038,
339
+ "step": 3200
340
+ },
341
+ {
342
+ "epoch": 0.6387921022067363,
343
+ "eval_loss": 0.6187321543693542,
344
+ "eval_runtime": 149.7679,
345
+ "eval_samples_per_second": 37.765,
346
+ "eval_steps_per_second": 4.721,
347
+ "eval_wer": 0.46661103176004237,
348
+ "step": 3300
349
+ },
350
+ {
351
+ "epoch": 0.6581494386372435,
352
+ "eval_loss": 0.6044796109199524,
353
+ "eval_runtime": 149.5983,
354
+ "eval_samples_per_second": 37.808,
355
+ "eval_steps_per_second": 4.726,
356
+ "eval_wer": 0.4725489881401358,
357
+ "step": 3400
358
+ },
359
+ {
360
+ "epoch": 0.6775067750677507,
361
+ "grad_norm": 4.122500419616699,
362
+ "learning_rate": 0.00020542105263157893,
363
+ "loss": 0.6462,
364
+ "step": 3500
365
+ },
366
+ {
367
+ "epoch": 0.6775067750677507,
368
+ "eval_loss": 0.5950499773025513,
369
+ "eval_runtime": 148.2511,
370
+ "eval_samples_per_second": 38.151,
371
+ "eval_steps_per_second": 4.769,
372
+ "eval_wer": 0.4716823674792573,
373
+ "step": 3500
374
+ },
375
+ {
376
+ "epoch": 0.6968641114982579,
377
+ "eval_loss": 0.5902624726295471,
378
+ "eval_runtime": 149.3094,
379
+ "eval_samples_per_second": 37.881,
380
+ "eval_steps_per_second": 4.735,
381
+ "eval_wer": 0.4602237165187527,
382
+ "step": 3600
383
+ },
384
+ {
385
+ "epoch": 0.716221447928765,
386
+ "eval_loss": 0.5864866375923157,
387
+ "eval_runtime": 149.6434,
388
+ "eval_samples_per_second": 37.797,
389
+ "eval_steps_per_second": 4.725,
390
+ "eval_wer": 0.47267737638619184,
391
+ "step": 3700
392
+ },
393
+ {
394
+ "epoch": 0.7355787843592722,
395
+ "eval_loss": 0.5820363759994507,
396
+ "eval_runtime": 148.886,
397
+ "eval_samples_per_second": 37.989,
398
+ "eval_steps_per_second": 4.749,
399
+ "eval_wer": 0.459036125242734,
400
+ "step": 3800
401
+ },
402
+ {
403
+ "epoch": 0.7549361207897793,
404
+ "eval_loss": 0.6025602221488953,
405
+ "eval_runtime": 148.9627,
406
+ "eval_samples_per_second": 37.969,
407
+ "eval_steps_per_second": 4.746,
408
+ "eval_wer": 0.48296448460143476,
409
+ "step": 3900
410
+ },
411
+ {
412
+ "epoch": 0.7742934572202865,
413
+ "grad_norm": 5.146019458770752,
414
+ "learning_rate": 0.0001896315789473684,
415
+ "loss": 0.6193,
416
+ "step": 4000
417
+ },
418
+ {
419
+ "epoch": 0.7742934572202865,
420
+ "eval_loss": 0.5807139277458191,
421
+ "eval_runtime": 147.9966,
422
+ "eval_samples_per_second": 38.217,
423
+ "eval_steps_per_second": 4.777,
424
+ "eval_wer": 0.44963168621912664,
425
+ "step": 4000
426
+ },
427
+ {
428
+ "epoch": 0.7936507936507936,
429
+ "eval_loss": 0.5620962977409363,
430
+ "eval_runtime": 148.8391,
431
+ "eval_samples_per_second": 38.001,
432
+ "eval_steps_per_second": 4.75,
433
+ "eval_wer": 0.44857248318916404,
434
+ "step": 4100
435
+ },
436
+ {
437
+ "epoch": 0.8130081300813008,
438
+ "eval_loss": 0.5730157494544983,
439
+ "eval_runtime": 148.9808,
440
+ "eval_samples_per_second": 37.965,
441
+ "eval_steps_per_second": 4.746,
442
+ "eval_wer": 0.4593410473271172,
443
+ "step": 4200
444
+ },
445
+ {
446
+ "epoch": 0.832365466511808,
447
+ "eval_loss": 0.5592055916786194,
448
+ "eval_runtime": 147.8897,
449
+ "eval_samples_per_second": 38.245,
450
+ "eval_steps_per_second": 4.781,
451
+ "eval_wer": 0.43741875431304267,
452
+ "step": 4300
453
+ },
454
+ {
455
+ "epoch": 0.8517228029423152,
456
+ "eval_loss": 0.5621338486671448,
457
+ "eval_runtime": 148.7799,
458
+ "eval_samples_per_second": 38.016,
459
+ "eval_steps_per_second": 4.752,
460
+ "eval_wer": 0.42387379435412686,
461
+ "step": 4400
462
+ },
463
+ {
464
+ "epoch": 0.8710801393728222,
465
+ "grad_norm": 2.8218295574188232,
466
+ "learning_rate": 0.0001738421052631579,
467
+ "loss": 0.59,
468
+ "step": 4500
469
+ },
470
+ {
471
+ "epoch": 0.8710801393728222,
472
+ "eval_loss": 0.545798659324646,
473
+ "eval_runtime": 150.2397,
474
+ "eval_samples_per_second": 37.647,
475
+ "eval_steps_per_second": 4.706,
476
+ "eval_wer": 0.4304055463722296,
477
+ "step": 4500
478
+ },
479
+ {
480
+ "epoch": 0.8904374758033294,
481
+ "eval_loss": 0.5406409502029419,
482
+ "eval_runtime": 148.931,
483
+ "eval_samples_per_second": 37.977,
484
+ "eval_steps_per_second": 4.747,
485
+ "eval_wer": 0.4270674519747717,
486
+ "step": 4600
487
+ },
488
+ {
489
+ "epoch": 0.9097948122338366,
490
+ "eval_loss": 0.5268651247024536,
491
+ "eval_runtime": 148.7725,
492
+ "eval_samples_per_second": 38.018,
493
+ "eval_steps_per_second": 4.752,
494
+ "eval_wer": 0.41315337580844474,
495
+ "step": 4700
496
+ },
497
+ {
498
+ "epoch": 0.9291521486643438,
499
+ "eval_loss": 0.5362106561660767,
500
+ "eval_runtime": 147.8165,
501
+ "eval_samples_per_second": 38.264,
502
+ "eval_steps_per_second": 4.783,
503
+ "eval_wer": 0.4214665147405755,
504
+ "step": 4800
505
+ },
506
+ {
507
+ "epoch": 0.948509485094851,
508
+ "eval_loss": 0.5226009488105774,
509
+ "eval_runtime": 149.4387,
510
+ "eval_samples_per_second": 37.848,
511
+ "eval_steps_per_second": 4.731,
512
+ "eval_wer": 0.41626679077530454,
513
+ "step": 4900
514
+ },
515
+ {
516
+ "epoch": 0.9678668215253581,
517
+ "grad_norm": 7.241621017456055,
518
+ "learning_rate": 0.00015808421052631577,
519
+ "loss": 0.5636,
520
+ "step": 5000
521
+ },
522
+ {
523
+ "epoch": 0.9678668215253581,
524
+ "eval_loss": 0.5297274589538574,
525
+ "eval_runtime": 149.3726,
526
+ "eval_samples_per_second": 37.865,
527
+ "eval_steps_per_second": 4.733,
528
+ "eval_wer": 0.4148384715379307,
529
+ "step": 5000
530
+ },
531
+ {
532
+ "epoch": 0.9872241579558653,
533
+ "eval_loss": 0.5225785970687866,
534
+ "eval_runtime": 149.3039,
535
+ "eval_samples_per_second": 37.882,
536
+ "eval_steps_per_second": 4.735,
537
+ "eval_wer": 0.413634831731155,
538
+ "step": 5100
539
+ },
540
+ {
541
+ "epoch": 1.0065814943863725,
542
+ "eval_loss": 0.5239331722259521,
543
+ "eval_runtime": 149.3008,
544
+ "eval_samples_per_second": 37.883,
545
+ "eval_steps_per_second": 4.735,
546
+ "eval_wer": 0.4054179839835663,
547
+ "step": 5200
548
+ },
549
+ {
550
+ "epoch": 1.0259388308168795,
551
+ "eval_loss": 0.5382751226425171,
552
+ "eval_runtime": 148.0837,
553
+ "eval_samples_per_second": 38.195,
554
+ "eval_steps_per_second": 4.774,
555
+ "eval_wer": 0.4057871001909775,
556
+ "step": 5300
557
+ },
558
+ {
559
+ "epoch": 1.0452961672473868,
560
+ "eval_loss": 0.5125272274017334,
561
+ "eval_runtime": 149.1205,
562
+ "eval_samples_per_second": 37.929,
563
+ "eval_steps_per_second": 4.741,
564
+ "eval_wer": 0.4067179149748841,
565
+ "step": 5400
566
+ },
567
+ {
568
+ "epoch": 1.064653503677894,
569
+ "grad_norm": 1.124423623085022,
570
+ "learning_rate": 0.00014232631578947366,
571
+ "loss": 0.4924,
572
+ "step": 5500
573
+ },
574
+ {
575
+ "epoch": 1.064653503677894,
576
+ "eval_loss": 0.5029215812683105,
577
+ "eval_runtime": 147.7988,
578
+ "eval_samples_per_second": 38.268,
579
+ "eval_steps_per_second": 4.784,
580
+ "eval_wer": 0.39533950666816453,
581
+ "step": 5500
582
+ },
583
+ {
584
+ "epoch": 1.084010840108401,
585
+ "eval_loss": 0.505442202091217,
586
+ "eval_runtime": 149.3815,
587
+ "eval_samples_per_second": 37.863,
588
+ "eval_steps_per_second": 4.733,
589
+ "eval_wer": 0.3932050520774823,
590
+ "step": 5600
591
+ },
592
+ {
593
+ "epoch": 1.1033681765389083,
594
+ "eval_loss": 0.4968744218349457,
595
+ "eval_runtime": 149.7956,
596
+ "eval_samples_per_second": 37.758,
597
+ "eval_steps_per_second": 4.72,
598
+ "eval_wer": 0.3894015502880711,
599
+ "step": 5700
600
+ },
601
+ {
602
+ "epoch": 1.1227255129694154,
603
+ "eval_loss": 0.49354633688926697,
604
+ "eval_runtime": 148.4196,
605
+ "eval_samples_per_second": 38.108,
606
+ "eval_steps_per_second": 4.764,
607
+ "eval_wer": 0.38508449551443563,
608
+ "step": 5800
609
+ },
610
+ {
611
+ "epoch": 1.1420828493999227,
612
+ "eval_loss": 0.49766939878463745,
613
+ "eval_runtime": 149.5302,
614
+ "eval_samples_per_second": 37.825,
615
+ "eval_steps_per_second": 4.728,
616
+ "eval_wer": 0.3816501099324357,
617
+ "step": 5900
618
+ },
619
+ {
620
+ "epoch": 1.1614401858304297,
621
+ "grad_norm": 1.9677255153656006,
622
+ "learning_rate": 0.00012653684210526316,
623
+ "loss": 0.4602,
624
+ "step": 6000
625
+ },
626
+ {
627
+ "epoch": 1.1614401858304297,
628
+ "eval_loss": 0.4862758219242096,
629
+ "eval_runtime": 150.9135,
630
+ "eval_samples_per_second": 37.478,
631
+ "eval_steps_per_second": 4.685,
632
+ "eval_wer": 0.387395483943445,
633
+ "step": 6000
634
+ },
635
+ {
636
+ "epoch": 1.1807975222609368,
637
+ "eval_loss": 0.4906172454357147,
638
+ "eval_runtime": 148.5353,
639
+ "eval_samples_per_second": 38.078,
640
+ "eval_steps_per_second": 4.76,
641
+ "eval_wer": 0.3776700743046974,
642
+ "step": 6100
643
+ },
644
+ {
645
+ "epoch": 1.2001548586914441,
646
+ "eval_loss": 0.4891129434108734,
647
+ "eval_runtime": 149.6289,
648
+ "eval_samples_per_second": 37.8,
649
+ "eval_steps_per_second": 4.725,
650
+ "eval_wer": 0.3763861918441367,
651
+ "step": 6200
652
+ },
653
+ {
654
+ "epoch": 1.2195121951219512,
655
+ "eval_loss": 0.488125741481781,
656
+ "eval_runtime": 148.288,
657
+ "eval_samples_per_second": 38.142,
658
+ "eval_steps_per_second": 4.768,
659
+ "eval_wer": 0.3800934024490058,
660
+ "step": 6300
661
+ },
662
+ {
663
+ "epoch": 1.2388695315524583,
664
+ "eval_loss": 0.48135778307914734,
665
+ "eval_runtime": 147.6048,
666
+ "eval_samples_per_second": 38.319,
667
+ "eval_steps_per_second": 4.79,
668
+ "eval_wer": 0.37266293270851053,
669
+ "step": 6400
670
+ },
671
+ {
672
+ "epoch": 1.2582268679829656,
673
+ "grad_norm": 1.2907174825668335,
674
+ "learning_rate": 0.00011074736842105263,
675
+ "loss": 0.4407,
676
+ "step": 6500
677
+ },
678
+ {
679
+ "epoch": 1.2582268679829656,
680
+ "eval_loss": 0.47142112255096436,
681
+ "eval_runtime": 147.9064,
682
+ "eval_samples_per_second": 38.24,
683
+ "eval_steps_per_second": 4.78,
684
+ "eval_wer": 0.37723676397425815,
685
+ "step": 6500
686
+ },
687
+ {
688
+ "epoch": 1.2775842044134726,
689
+ "eval_loss": 0.47389352321624756,
690
+ "eval_runtime": 146.8545,
691
+ "eval_samples_per_second": 38.514,
692
+ "eval_steps_per_second": 4.814,
693
+ "eval_wer": 0.3705605751793423,
694
+ "step": 6600
695
+ },
696
+ {
697
+ "epoch": 1.29694154084398,
698
+ "eval_loss": 0.4691925644874573,
699
+ "eval_runtime": 146.6568,
700
+ "eval_samples_per_second": 38.566,
701
+ "eval_steps_per_second": 4.821,
702
+ "eval_wer": 0.3713790502479498,
703
+ "step": 6700
704
+ },
705
+ {
706
+ "epoch": 1.316298877274487,
707
+ "eval_loss": 0.4672953486442566,
708
+ "eval_runtime": 146.8165,
709
+ "eval_samples_per_second": 38.524,
710
+ "eval_steps_per_second": 4.816,
711
+ "eval_wer": 0.3728073694853236,
712
+ "step": 6800
713
+ },
714
+ {
715
+ "epoch": 1.3356562137049943,
716
+ "eval_loss": 0.46098417043685913,
717
+ "eval_runtime": 147.3051,
718
+ "eval_samples_per_second": 38.397,
719
+ "eval_steps_per_second": 4.8,
720
+ "eval_wer": 0.36780022788913674,
721
+ "step": 6900
722
+ },
723
+ {
724
+ "epoch": 1.3550135501355014,
725
+ "grad_norm": 0.8472552299499512,
726
+ "learning_rate": 9.49578947368421e-05,
727
+ "loss": 0.4284,
728
+ "step": 7000
729
+ },
730
+ {
731
+ "epoch": 1.3550135501355014,
732
+ "eval_loss": 0.47299668192863464,
733
+ "eval_runtime": 151.323,
734
+ "eval_samples_per_second": 37.377,
735
+ "eval_steps_per_second": 4.672,
736
+ "eval_wer": 0.36531270562180035,
737
+ "step": 7000
738
+ },
739
+ {
740
+ "epoch": 1.3743708865660085,
741
+ "eval_loss": 0.46056076884269714,
742
+ "eval_runtime": 146.0139,
743
+ "eval_samples_per_second": 38.736,
744
+ "eval_steps_per_second": 4.842,
745
+ "eval_wer": 0.36399672609972555,
746
+ "step": 7100
747
+ },
748
+ {
749
+ "epoch": 1.3937282229965158,
750
+ "eval_loss": 0.4571812152862549,
751
+ "eval_runtime": 146.7792,
752
+ "eval_samples_per_second": 38.534,
753
+ "eval_steps_per_second": 4.817,
754
+ "eval_wer": 0.3620067082858564,
755
+ "step": 7200
756
+ },
757
+ {
758
+ "epoch": 1.4130855594270229,
759
+ "eval_loss": 0.45746785402297974,
760
+ "eval_runtime": 146.8097,
761
+ "eval_samples_per_second": 38.526,
762
+ "eval_steps_per_second": 4.816,
763
+ "eval_wer": 0.362969620131277,
764
+ "step": 7300
765
+ },
766
+ {
767
+ "epoch": 1.43244289585753,
768
+ "eval_loss": 0.45778077840805054,
769
+ "eval_runtime": 146.9433,
770
+ "eval_samples_per_second": 38.491,
771
+ "eval_steps_per_second": 4.811,
772
+ "eval_wer": 0.3590216815650527,
773
+ "step": 7400
774
+ },
775
+ {
776
+ "epoch": 1.4518002322880372,
777
+ "grad_norm": 0.9635696411132812,
778
+ "learning_rate": 7.916842105263156e-05,
779
+ "loss": 0.4299,
780
+ "step": 7500
781
+ },
782
+ {
783
+ "epoch": 1.4518002322880372,
784
+ "eval_loss": 0.4477390646934509,
785
+ "eval_runtime": 146.7454,
786
+ "eval_samples_per_second": 38.543,
787
+ "eval_steps_per_second": 4.818,
788
+ "eval_wer": 0.3569193240358845,
789
+ "step": 7500
790
+ },
791
+ {
792
+ "epoch": 1.4711575687185443,
793
+ "eval_loss": 0.4441732168197632,
794
+ "eval_runtime": 147.4263,
795
+ "eval_samples_per_second": 38.365,
796
+ "eval_steps_per_second": 4.796,
797
+ "eval_wer": 0.3551700341833705,
798
+ "step": 7600
799
+ },
800
+ {
801
+ "epoch": 1.4905149051490514,
802
+ "eval_loss": 0.4420062303543091,
803
+ "eval_runtime": 146.725,
804
+ "eval_samples_per_second": 38.548,
805
+ "eval_steps_per_second": 4.819,
806
+ "eval_wer": 0.3546083356068752,
807
+ "step": 7700
808
+ },
809
+ {
810
+ "epoch": 1.5098722415795587,
811
+ "eval_loss": 0.4436999559402466,
812
+ "eval_runtime": 145.7818,
813
+ "eval_samples_per_second": 38.798,
814
+ "eval_steps_per_second": 4.85,
815
+ "eval_wer": 0.3482531174270995,
816
+ "step": 7800
817
+ },
818
+ {
819
+ "epoch": 1.5292295780100658,
820
+ "eval_loss": 0.43728071451187134,
821
+ "eval_runtime": 146.721,
822
+ "eval_samples_per_second": 38.549,
823
+ "eval_steps_per_second": 4.819,
824
+ "eval_wer": 0.3485740880422397,
825
+ "step": 7900
826
+ },
827
+ {
828
+ "epoch": 1.5485869144405728,
829
+ "grad_norm": 1.1358890533447266,
830
+ "learning_rate": 6.341052631578946e-05,
831
+ "loss": 0.408,
832
+ "step": 8000
833
+ },
834
+ {
835
+ "epoch": 1.5485869144405728,
836
+ "eval_loss": 0.4335756301879883,
837
+ "eval_runtime": 146.7599,
838
+ "eval_samples_per_second": 38.539,
839
+ "eval_steps_per_second": 4.817,
840
+ "eval_wer": 0.3464075363900435,
841
+ "step": 8000
842
+ },
843
+ {
844
+ "epoch": 1.5679442508710801,
845
+ "eval_loss": 0.4347936511039734,
846
+ "eval_runtime": 146.7423,
847
+ "eval_samples_per_second": 38.544,
848
+ "eval_steps_per_second": 4.818,
849
+ "eval_wer": 0.34475453772207154,
850
+ "step": 8100
851
+ },
852
+ {
853
+ "epoch": 1.5873015873015874,
854
+ "eval_loss": 0.42762240767478943,
855
+ "eval_runtime": 151.2432,
856
+ "eval_samples_per_second": 37.397,
857
+ "eval_steps_per_second": 4.675,
858
+ "eval_wer": 0.34180160806278187,
859
+ "step": 8200
860
+ },
861
+ {
862
+ "epoch": 1.6066589237320945,
863
+ "eval_loss": 0.42939648032188416,
864
+ "eval_runtime": 146.0228,
865
+ "eval_samples_per_second": 38.734,
866
+ "eval_steps_per_second": 4.842,
867
+ "eval_wer": 0.3399078814334548,
868
+ "step": 8300
869
+ },
870
+ {
871
+ "epoch": 1.6260162601626016,
872
+ "eval_loss": 0.42716294527053833,
873
+ "eval_runtime": 145.901,
874
+ "eval_samples_per_second": 38.766,
875
+ "eval_steps_per_second": 4.846,
876
+ "eval_wer": 0.3387523872189501,
877
+ "step": 8400
878
+ },
879
+ {
880
+ "epoch": 1.645373596593109,
881
+ "grad_norm": 1.037522792816162,
882
+ "learning_rate": 4.762105263157894e-05,
883
+ "loss": 0.3964,
884
+ "step": 8500
885
+ },
886
+ {
887
+ "epoch": 1.645373596593109,
888
+ "eval_loss": 0.4310940206050873,
889
+ "eval_runtime": 145.7507,
890
+ "eval_samples_per_second": 38.806,
891
+ "eval_steps_per_second": 4.851,
892
+ "eval_wer": 0.3408707932788753,
893
+ "step": 8500
894
+ },
895
+ {
896
+ "epoch": 1.664730933023616,
897
+ "eval_loss": 0.4260464608669281,
898
+ "eval_runtime": 146.3966,
899
+ "eval_samples_per_second": 38.635,
900
+ "eval_steps_per_second": 4.829,
901
+ "eval_wer": 0.3381264945194267,
902
+ "step": 8600
903
+ },
904
+ {
905
+ "epoch": 1.684088269454123,
906
+ "eval_loss": 0.4260489046573639,
907
+ "eval_runtime": 146.6331,
908
+ "eval_samples_per_second": 38.572,
909
+ "eval_steps_per_second": 4.822,
910
+ "eval_wer": 0.3370672914894641,
911
+ "step": 8700
912
+ },
913
+ {
914
+ "epoch": 1.7034456058846303,
915
+ "eval_loss": 0.4259546101093292,
916
+ "eval_runtime": 146.1762,
917
+ "eval_samples_per_second": 38.693,
918
+ "eval_steps_per_second": 4.837,
919
+ "eval_wer": 0.33636115613615575,
920
+ "step": 8800
921
+ },
922
+ {
923
+ "epoch": 1.7228029423151374,
924
+ "eval_loss": 0.42149877548217773,
925
+ "eval_runtime": 147.5316,
926
+ "eval_samples_per_second": 38.338,
927
+ "eval_steps_per_second": 4.792,
928
+ "eval_wer": 0.335109370737109,
929
+ "step": 8900
930
+ },
931
+ {
932
+ "epoch": 1.7421602787456445,
933
+ "grad_norm": 1.4968059062957764,
934
+ "learning_rate": 3.186315789473684e-05,
935
+ "loss": 0.3866,
936
+ "step": 9000
937
+ },
938
+ {
939
+ "epoch": 1.7421602787456445,
940
+ "eval_loss": 0.4234353303909302,
941
+ "eval_runtime": 146.8779,
942
+ "eval_samples_per_second": 38.508,
943
+ "eval_steps_per_second": 4.814,
944
+ "eval_wer": 0.3330391102694548,
945
+ "step": 9000
946
+ },
947
+ {
948
+ "epoch": 1.7615176151761518,
949
+ "eval_loss": 0.4210032522678375,
950
+ "eval_runtime": 146.0169,
951
+ "eval_samples_per_second": 38.735,
952
+ "eval_steps_per_second": 4.842,
953
+ "eval_wer": 0.3318515189934362,
954
+ "step": 9100
955
+ },
956
+ {
957
+ "epoch": 1.7808749516066589,
958
+ "eval_loss": 0.41560646891593933,
959
+ "eval_runtime": 145.9957,
960
+ "eval_samples_per_second": 38.741,
961
+ "eval_steps_per_second": 4.843,
962
+ "eval_wer": 0.3300540835486511,
963
+ "step": 9200
964
+ },
965
+ {
966
+ "epoch": 1.800232288037166,
967
+ "eval_loss": 0.41584905982017517,
968
+ "eval_runtime": 147.0182,
969
+ "eval_samples_per_second": 38.471,
970
+ "eval_steps_per_second": 4.809,
971
+ "eval_wer": 0.33032690857152025,
972
+ "step": 9300
973
+ },
974
+ {
975
+ "epoch": 1.8195896244676733,
976
+ "eval_loss": 0.41545388102531433,
977
+ "eval_runtime": 147.1819,
978
+ "eval_samples_per_second": 38.429,
979
+ "eval_steps_per_second": 4.804,
980
+ "eval_wer": 0.32944423937988476,
981
+ "step": 9400
982
+ },
983
+ {
984
+ "epoch": 1.8389469608981805,
985
+ "grad_norm": 0.9967782497406006,
986
+ "learning_rate": 1.6073684210526313e-05,
987
+ "loss": 0.37,
988
+ "step": 9500
989
+ },
990
+ {
991
+ "epoch": 1.8389469608981805,
992
+ "eval_loss": 0.41372692584991455,
993
+ "eval_runtime": 146.2893,
994
+ "eval_samples_per_second": 38.663,
995
+ "eval_steps_per_second": 4.833,
996
+ "eval_wer": 0.32921955994928664,
997
+ "step": 9500
998
+ },
999
+ {
1000
+ "epoch": 1.8583042973286876,
1001
+ "eval_loss": 0.4120025932788849,
1002
+ "eval_runtime": 146.1391,
1003
+ "eval_samples_per_second": 38.703,
1004
+ "eval_steps_per_second": 4.838,
1005
+ "eval_wer": 0.3284492304729502,
1006
+ "step": 9600
1007
+ },
1008
+ {
1009
+ "epoch": 1.8776616337591947,
1010
+ "eval_loss": 0.4108966886997223,
1011
+ "eval_runtime": 146.9334,
1012
+ "eval_samples_per_second": 38.494,
1013
+ "eval_steps_per_second": 4.812,
1014
+ "eval_wer": 0.3300701320794081,
1015
+ "step": 9700
1016
+ },
1017
+ {
1018
+ "epoch": 1.897018970189702,
1019
+ "eval_loss": 0.4100329577922821,
1020
+ "eval_runtime": 146.8452,
1021
+ "eval_samples_per_second": 38.517,
1022
+ "eval_steps_per_second": 4.815,
1023
+ "eval_wer": 0.32785543483494084,
1024
+ "step": 9800
1025
+ },
1026
+ {
1027
+ "epoch": 1.916376306620209,
1028
+ "eval_loss": 0.4094770848751068,
1029
+ "eval_runtime": 146.5252,
1030
+ "eval_samples_per_second": 38.601,
1031
+ "eval_steps_per_second": 4.825,
1032
+ "eval_wer": 0.3266999406204362,
1033
+ "step": 9900
1034
+ },
1035
+ {
1036
+ "epoch": 1.9357336430507162,
1037
+ "grad_norm": 0.7779282927513123,
1038
+ "learning_rate": 2.842105263157894e-07,
1039
+ "loss": 0.371,
1040
+ "step": 10000
1041
+ },
1042
+ {
1043
+ "epoch": 1.9357336430507162,
1044
+ "eval_loss": 0.409473717212677,
1045
+ "eval_runtime": 146.0247,
1046
+ "eval_samples_per_second": 38.733,
1047
+ "eval_steps_per_second": 4.842,
1048
+ "eval_wer": 0.3270690568278474,
1049
+ "step": 10000
1050
+ },
1051
+ {
1052
+ "epoch": 1.9357336430507162,
1053
+ "step": 10000,
1054
+ "total_flos": 1.1255918428180738e+19,
1055
+ "train_loss": 0.7365739318847656,
1056
+ "train_runtime": 19473.5524,
1057
+ "train_samples_per_second": 4.108,
1058
+ "train_steps_per_second": 0.514
1059
+ }
1060
+ ],
1061
+ "logging_steps": 500,
1062
+ "max_steps": 10000,
1063
+ "num_input_tokens_seen": 0,
1064
+ "num_train_epochs": 2,
1065
+ "save_steps": 400,
1066
+ "stateful_callbacks": {
1067
+ "TrainerControl": {
1068
+ "args": {
1069
+ "should_epoch_stop": false,
1070
+ "should_evaluate": false,
1071
+ "should_log": false,
1072
+ "should_save": true,
1073
+ "should_training_stop": true
1074
+ },
1075
+ "attributes": {}
1076
+ }
1077
+ },
1078
+ "total_flos": 1.1255918428180738e+19,
1079
+ "train_batch_size": 8,
1080
+ "trial_name": null,
1081
+ "trial_params": null
1082
+ }
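The `log_history` array in `trainer_state.json` mixes training entries (with `loss` and `learning_rate`) and evaluation entries (with `eval_loss` and `eval_wer`), and `best_metric` is `null`, so the best checkpoint is not recorded directly. A minimal sketch of how one might pick out the lowest evaluation metric is shown below. The helper name `best_eval` is hypothetical, and `sample_log` holds only a few entries copied from the log above, trimmed to the relevant keys.

```python
def best_eval(log_history, metric="eval_wer"):
    """Return (step, value) for the lowest `metric` among evaluation entries."""
    # Training entries do not carry eval_* keys, so this filter skips them.
    evals = [entry for entry in log_history if metric in entry]
    if not evals:
        raise ValueError(f"no entries contain {metric!r}")
    best = min(evals, key=lambda entry: entry[metric])
    return best["step"], best[metric]


# A few entries copied from log_history, trimmed to the relevant keys.
sample_log = [
    {"step": 9500, "loss": 0.37, "learning_rate": 1.6073684210526313e-05},
    {"step": 9800, "eval_loss": 0.4100329577922821, "eval_wer": 0.32785543483494084},
    {"step": 9900, "eval_loss": 0.4094770848751068, "eval_wer": 0.3266999406204362},
    {"step": 10000, "eval_loss": 0.409473717212677, "eval_wer": 0.3270690568278474},
]

print(best_eval(sample_log))
print(best_eval(sample_log, metric="eval_loss"))
```

On these excerpted entries the lowest `eval_wer` falls at step 9900, slightly below the final value at step 10000, while `eval_loss` is lowest at step 10000. To run the helper on the full file, pass it the `log_history` list obtained by loading `trainer_state.json` with the standard `json` module.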