autoevaluator HF staff committed on
Commit b6818aa
1 Parent(s): bd1c868

Add evaluation results on the samsum config and test split of samsum


Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the samsum config and test split of the [samsum](https://huggingface.co/datasets/samsum) dataset by @TheAlphaQ, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-eval-samsum-samsum-6999f5-3301091732).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=samsum).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=samsum).
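
The ROUGE-1 score reported by this evaluation measures unigram overlap between a generated summary and its reference. As a rough illustration only, here is a minimal sketch that assumes lowercase whitespace tokenization. The function name `rouge1_f1` is hypothetical, and the evaluator's actual scoring (tokenization, stemming, aggregation over the test split) is not shown in this pull request and will differ in detail.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two texts (a simplified ROUGE-1)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a word counts only as often as it appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
    # Scaled by 100 on the assumption that the reported scores use a 0-100 scale.
    print(f"{100 * score:.2f}")
```

The scores in the diff appear to be on a 0 to 100 scale, which is why the example multiplies the fraction by 100 before printing.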

Files changed (1)
  1. README.md +115 -80
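
Many of the added lines in this change are `verifyToken` values. Each one has three dot-separated base64url segments, the shape of a JSON Web Token. The following is a minimal sketch, using only the Python standard library, that decodes the first two segments of the ROUGE-1 token from the diff below. The helper names are hypothetical, the code does not verify the signature, and what the `hash` field covers is not stated in this pull request.

```python
import base64
import json

# verifyToken of the ROUGE-1 entry, copied from the diff in this pull request.
TOKEN = (
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9."
    "eyJoYXNoIjoiOTVkNTczYjFmYzBmMzczNWE0MGY4MDAyZWExOGNjZmY1Yzk2ZGM1MGNjZmFm"
    "YWUyZmIxZjdjOTk4OTc4OGJlMSIsInZlcnNpb24iOjF9."
    "yyzPpGtESuZXy_lBESrboGxdGYB7I6jaIjquCYqliE2xdbGf5awDFpDUwlZHDuw6RD2mIZv1"
    "FC8PPs9lOHuSAg"
)


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped '=' padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_unverified(token: str) -> tuple[dict, dict]:
    """Return (header, payload) as dicts WITHOUT checking the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload


if __name__ == "__main__":
    header, payload = decode_unverified(TOKEN)
    print(header)           # signing algorithm and token type
    print(sorted(payload))  # names of the payload fields
```

Decoding this way only exposes what the token says about itself; trusting the `verified: true` flag would still require checking the signature against the issuer's public key, which is outside this sketch.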
README.md CHANGED
@@ -1,10 +1,10 @@
1
  ---
2
  language: en
 
3
  tags:
4
  - bart
5
  - seq2seq
6
  - summarization
7
- license: apache-2.0
8
  datasets:
9
  - cnndaily/newyorkdaily/xsum/samsum/dialogsum/AMI
10
  metrics:
@@ -190,42 +190,42 @@ model-index:
190
  - name: MEETING_SUMMARY
191
  results:
192
  - task:
193
- name: Abstractive Text Summarization
194
  type: abstractive-text-summarization
 
195
  dataset:
196
  name: samsum
197
  type: samsum
198
  metrics:
199
- - name: Validation ROGUE-1
200
- type: rouge-1
201
  value: 53.8795
202
- - name: Validation ROGUE-2
203
- type: rouge-2
204
  value: 28.4975
205
- - name: Validation ROGUE-L
206
- type: rouge-L
207
  value: 44.1899
208
- - name: Validation ROGUE-Lsum
209
- type: rouge-Lsum
210
  value: 49.4863
211
- - name: Validation ROGUE-Lsum
212
- type: gen-length
213
  value: 30.088
214
- - name: Test ROGUE-1
215
- type: rouge-1
216
  value: 53.2284
217
- - name: Test ROGUE-2
218
- type: rouge-2
219
  value: 28.184
220
- - name: Test ROGUE-L
221
- type: rouge-L
222
  value: 44.122
223
- - name: Test ROGUE-Lsum
224
- type: rouge-Lsum
225
  value: 49.0301
226
- - name: Test ROGUE-Lsum
227
- type: gen-length
228
  value: 29.9951
 
229
  - task:
230
  type: summarization
231
  name: Summarization
@@ -235,108 +235,143 @@ model-index:
235
  config: bazzhangz--sumdataset
236
  split: train
237
  metrics:
238
- - name: ROUGE-1
239
- type: rouge
240
  value: 40.5544
 
241
  verified: true
242
- - name: ROUGE-2
243
- type: rouge
244
  value: 17.0751
 
245
  verified: true
246
- - name: ROUGE-L
247
- type: rouge
248
  value: 32.153
 
249
  verified: true
250
- - name: ROUGE-LSUM
251
- type: rouge
252
  value: 36.4277
 
253
  verified: true
254
- - name: loss
255
- type: loss
256
  value: 2.116729736328125
 
257
  verified: true
258
- - name: gen_len
259
- type: gen_len
260
  value: 42.1978
 
261
  verified: true
262
- - name: MEETING_SUMMARY
263
- results:
264
  - task:
265
- name: Abstractive Text Summarization
266
  type: abstractive-text-summarization
 
267
  dataset:
268
  name: xsum
269
  type: xsum
270
  metrics:
271
- - name: Validation ROGUE-1
272
- type: rouge-1
273
  value: 35.9078
274
- - name: Validation ROGUE-2
275
- type: rouge-2
276
  value: 14.2497
277
- - name: Validation ROGUE-L
278
- type: rouge-L
279
  value: 28.1421
280
- - name: Validation ROGUE-Lsum
281
- type: rouge-Lsum
282
  value: 28.9826
283
- - name: Validation ROGUE-Lsum
284
- type: gen-length
285
  value: 32.0167
286
- - name: Test ROGUE-1
287
- type: rouge-1
288
  value: 36.0241
289
- - name: Test ROGUE-2
290
- type: rouge-2
291
  value: 14.3715
292
- - name: Test ROGUE-L
293
- type: rouge-L
294
  value: 28.1968
295
- - name: Test ROGUE-Lsum
296
- type: rouge-Lsum
297
  value: 29.0527
298
- - name: Test ROGUE-Lsum
299
- type: gen-length
300
  value: 31.9933
301
- - name: MEETING_SUMMARY
302
- results:
303
  - task:
304
- name: Abstractive Text Summarization
305
  type: abstractive-text-summarization
 
306
  dataset:
307
  name: dialogsum
308
  type: dialogsum
309
  metrics:
310
- - name: Validation ROGUE-1
311
- type: rouge-1
312
  value: 39.8612
313
- - name: Validation ROGUE-2
314
- type: rouge-2
315
  value: 16.6917
316
- - name: Validation ROGUE-L
317
- type: rouge-L
318
  value: 32.2718
319
- - name: Validation ROGUE-Lsum
320
- type: rouge-Lsum
321
  value: 35.8748
322
- - name: Validation ROGUE-Lsum
323
- type: gen-length
324
  value: 41.726
325
- - name: Test ROGUE-1
326
- type: rouge-1
327
  value: 36.9608
328
- - name: Test ROGUE-2
329
- type: rouge-2
330
  value: 14.3058
331
- - name: Test ROGUE-L
332
- type: rouge-L
333
  value: 29.3261
334
- - name: Test ROGUE-Lsum
335
- type: rouge-Lsum
336
  value: 32.9
337
- - name: Test ROGUE-Lsum
338
- type: gen-length
339
  value: 43.086
340
  ---
341
  Model obtained by Fine Tuning 'facebook/bart-large-xsum' using AMI Meeting Corpus, SAMSUM Dataset, DIALOGSUM Dataset, XSUM Dataset!
342
  ## Usage
 
1
  ---
2
  language: en
3
+ license: apache-2.0
4
  tags:
5
  - bart
6
  - seq2seq
7
  - summarization
 
8
  datasets:
9
  - cnndaily/newyorkdaily/xsum/samsum/dialogsum/AMI
10
  metrics:
 
190
  - name: MEETING_SUMMARY
191
  results:
192
  - task:
 
193
  type: abstractive-text-summarization
194
+ name: Abstractive Text Summarization
195
  dataset:
196
  name: samsum
197
  type: samsum
198
  metrics:
199
+ - type: rouge-1
 
200
  value: 53.8795
201
+ name: Validation ROGUE-1
202
+ - type: rouge-2
203
  value: 28.4975
204
+ name: Validation ROGUE-2
205
+ - type: rouge-L
206
  value: 44.1899
207
+ name: Validation ROGUE-L
208
+ - type: rouge-Lsum
209
  value: 49.4863
210
+ name: Validation ROGUE-Lsum
211
+ - type: gen-length
212
  value: 30.088
213
+ name: Validation ROGUE-Lsum
214
+ - type: rouge-1
215
  value: 53.2284
216
+ name: Test ROGUE-1
217
+ - type: rouge-2
218
  value: 28.184
219
+ name: Test ROGUE-2
220
+ - type: rouge-L
221
  value: 44.122
222
+ name: Test ROGUE-L
223
+ - type: rouge-Lsum
224
  value: 49.0301
225
+ name: Test ROGUE-Lsum
226
+ - type: gen-length
227
  value: 29.9951
228
+ name: Test ROGUE-Lsum
229
  - task:
230
  type: summarization
231
  name: Summarization
 
235
  config: bazzhangz--sumdataset
236
  split: train
237
  metrics:
238
+ - type: rouge
 
239
  value: 40.5544
240
+ name: ROUGE-1
241
  verified: true
242
+ - type: rouge
 
243
  value: 17.0751
244
+ name: ROUGE-2
245
  verified: true
246
+ - type: rouge
 
247
  value: 32.153
248
+ name: ROUGE-L
249
  verified: true
250
+ - type: rouge
 
251
  value: 36.4277
252
+ name: ROUGE-LSUM
253
  verified: true
254
+ - type: loss
 
255
  value: 2.116729736328125
256
+ name: loss
257
  verified: true
258
+ - type: gen_len
 
259
  value: 42.1978
260
+ name: gen_len
261
  verified: true
 
 
262
  - task:
 
263
  type: abstractive-text-summarization
264
+ name: Abstractive Text Summarization
265
  dataset:
266
  name: xsum
267
  type: xsum
268
  metrics:
269
+ - type: rouge-1
 
270
  value: 35.9078
271
+ name: Validation ROGUE-1
272
+ - type: rouge-2
273
  value: 14.2497
274
+ name: Validation ROGUE-2
275
+ - type: rouge-L
276
  value: 28.1421
277
+ name: Validation ROGUE-L
278
+ - type: rouge-Lsum
279
  value: 28.9826
280
+ name: Validation ROGUE-Lsum
281
+ - type: gen-length
282
  value: 32.0167
283
+ name: Validation ROGUE-Lsum
284
+ - type: rouge-1
285
  value: 36.0241
286
+ name: Test ROGUE-1
287
+ - type: rouge-2
288
  value: 14.3715
289
+ name: Test ROGUE-2
290
+ - type: rouge-L
291
  value: 28.1968
292
+ name: Test ROGUE-L
293
+ - type: rouge-Lsum
294
  value: 29.0527
295
+ name: Test ROGUE-Lsum
296
+ - type: gen-length
297
  value: 31.9933
298
+ name: Test ROGUE-Lsum
 
299
  - task:
 
300
  type: abstractive-text-summarization
301
+ name: Abstractive Text Summarization
302
  dataset:
303
  name: dialogsum
304
  type: dialogsum
305
  metrics:
306
+ - type: rouge-1
 
307
  value: 39.8612
308
+ name: Validation ROGUE-1
309
+ - type: rouge-2
310
  value: 16.6917
311
+ name: Validation ROGUE-2
312
+ - type: rouge-L
313
  value: 32.2718
314
+ name: Validation ROGUE-L
315
+ - type: rouge-Lsum
316
  value: 35.8748
317
+ name: Validation ROGUE-Lsum
318
+ - type: gen-length
319
  value: 41.726
320
+ name: Validation ROGUE-Lsum
321
+ - type: rouge-1
322
  value: 36.9608
323
+ name: Test ROGUE-1
324
+ - type: rouge-2
325
  value: 14.3058
326
+ name: Test ROGUE-2
327
+ - type: rouge-L
328
  value: 29.3261
329
+ name: Test ROGUE-L
330
+ - type: rouge-Lsum
331
  value: 32.9
332
+ name: Test ROGUE-Lsum
333
+ - type: gen-length
334
  value: 43.086
335
+ name: Test ROGUE-Lsum
336
+ - task:
337
+ type: summarization
338
+ name: Summarization
339
+ dataset:
340
+ name: samsum
341
+ type: samsum
342
+ config: samsum
343
+ split: test
344
+ metrics:
345
+ - type: rouge
346
+ value: 53.1878
347
+ name: ROUGE-1
348
+ verified: true
349
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTVkNTczYjFmYzBmMzczNWE0MGY4MDAyZWExOGNjZmY1Yzk2ZGM1MGNjZmFmYWUyZmIxZjdjOTk4OTc4OGJlMSIsInZlcnNpb24iOjF9.yyzPpGtESuZXy_lBESrboGxdGYB7I6jaIjquCYqliE2xdbGf5awDFpDUwlZHDuw6RD2mIZv1FC8PPs9lOHuSAg
350
+ - type: rouge
351
+ value: 28.1666
352
+ name: ROUGE-2
353
+ verified: true
354
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjAzOTdjNGYxNWMzYmFjYjRmMTcxYzI0MmNlNmM5Nzg2MzBlNDdmZWFkN2EwMDE2ZTZmYzc0Zjg0ZDc0M2IxNiIsInZlcnNpb24iOjF9.cPH6O50T6HekO227Xzha-EN_Jp7JS9fh5EP9I0tHxbpGptKtZOQC-NG68zfU2eJKlRSrmgaBYs8tjfTvpAgyDg
355
+ - type: rouge
356
+ value: 44.117
357
+ name: ROUGE-L
358
+ verified: true
359
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNmNmMzJkYjMxMjhlZDM4YmU3NmI1MDExNzhiYmVhMzEyZGJjNDJkNzczNGQwOTMwNzg2YjU1ZWQ4MDhiMzkxYiIsInZlcnNpb24iOjF9.lcEXK15UqZOdXnPjVqIhFd6o_PLROSIONTRFX5NbwanjEI_MWMLpDh_V0Kpnvs_W0sE6cXh2yoifSYNDA5W7Bw
360
+ - type: rouge
361
+ value: 49.0094
362
+ name: ROUGE-LSUM
363
+ verified: true
364
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYThkYjk4ZjMzYjI0OTAxNDJiZTU5MzE0YjI5MjEzYTYwNWEzMmU5NjU2ZjQ5NzJhMzkyNmVhNWFjZmM1MjAwMSIsInZlcnNpb24iOjF9.LTn6LpKuMO4Rv4NgsbPmtr2ewiKyoqAXlf6YJfM_6GKwVTKpnJxwx7gaaAtMb0jVlgieITMP11JmbeRfMEhgDg
365
+ - type: loss
366
+ value: 1.710614562034607
367
+ name: loss
368
+ verified: true
369
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNjNjZmM0ZjkwYWYyMWIyMmFiMWI1ODBiYjRjNzVhM2JhN2NmNmM1ZDUwZWRjNDQxNzUwMWM4YjYxYTg1MWYwNyIsInZlcnNpb24iOjF9.hGXZhp9pe-HDJilXVvMCkqz-92YZvH6Qr7q9Z7fJkm8N9s0b4sl-4PwjQYJEOLEAhoRO2s-F5T3bmCYCaMiNBQ
370
+ - type: gen_len
371
+ value: 29.9951
372
+ name: gen_len
373
+ verified: true
374
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmY1NzZiMDAzNGJlNTg4Nzc0YzU1MTA3YTI3MzVmNGZkNWQ0ZDE4MGZlNGI1MzJmYzA3MjQ0MDZhMTcyYTk2NCIsInZlcnNpb24iOjF9.8dvMfY7Y-nw-K8NGgTXIGFMxaSUWQYBE1w3N5YYOn4iwnCe2ugo2qPIOxLY91q7CaAOMCSskFV3BDStQ4p0ZCg
375
  ---
376
  Model obtained by Fine Tuning 'facebook/bart-large-xsum' using AMI Meeting Corpus, SAMSUM Dataset, DIALOGSUM Dataset, XSUM Dataset!
377
  ## Usage