pszemraj and autoevaluator (HF staff) committed
Commit 4e991f6
Parent: 62d306d

Add evaluation results on the default config of billsum (#8)


- Add evaluation results on the default config of billsum (be12297939e7872a9f0cd9df739364af008f7364)


Co-authored-by: Evaluation Bot <[email protected]>

Files changed (1)
  1. README.md +82 -1
README.md CHANGED
@@ -58,7 +58,55 @@ widget:
  \ and parameters 0, and generalization is influenced by the inductive bias of\
  \ this function space (Section 5)."
  example_title: scientific paper
- - text: "Is a else or outside the cob and tree written being of early client rope and you have is for good reasons. On to the ocean in Orange for time. By's the aggregate we can bed it yet. Why this please pick up on a sort is do and also M Getoi's nerocos and do rain become you to let so is his brother is made in use and Mjulia's's the lay major is aging Masastup coin present sea only of Oosii rooms set to you We do er do we easy this private oliiishs lonthen might be okay. Good afternoon everybody. Welcome to this lecture of Computational Statistics. As you can see, I'm not socially my name is Michael Zelinger. I'm one of the task for this class and you might have already seen me in the first lecture where I made a quick appearance. I'm also going to give the tortillas in the last third of this course. So to give you a little bit about me, I'm a old student here with better Bulman and my research centres on casual inference applied to biomedical disasters, so that could be genomics or that could be hospital data. If any of you is interested in writing a bachelor thesis, a semester paper may be mastathesis about this topic feel for reach out to me. you have my name on models and my email address you can find in the directory I'd Be very happy to talk about it. you do not need to be sure about it, we can just have a chat. So with that said, let's get on with the lecture. There's an exciting topic today I'm going to start by sharing some slides with you and later on during the lecture we'll move to the paper. So bear with me for a few seconds. Well, the projector is starting up. Okay, so let's get started. Today's topic is a very important one. It's about a technique which really forms one of the fundamentals of data science, machine learning, and any sort of modern statistics. It's called cross validation. I know you really want to understand this topic I Want you to understand this and frankly, nobody's gonna leave Professor Mineshousen's class without understanding cross validation. So to set the stage for this, I Want to introduce you to the validation problem in computational statistics. So the problem is the following: You trained a model on available data. You fitted your model, but you know the training data you got could always have been different and some data from the environment. Maybe it's a random process. You do not really know what it is, but you know that somebody else who gets a different batch of data from the same environment they would get slightly different training data and you do not care that your method performs as well. On this training data. you want to to perform well on other data that you have not seen other data from the same environment. So in other words, the validation problem is you want to quantify the performance of your model on data that you have not seen. So how is this even possible? How could you possibly measure the performance on data that you do not know The solution to? This is the following realization is that given that you have a bunch of data, you were in charge. You get to control how much that your model sees. It works in the following way: You can hide data firms model. Let's say you have a training data set which is a bunch of doubtless so X eyes are the features those are typically hide and national vector. It's got more than one dimension for sure. And the why why eyes. Those are the labels for supervised learning. As you've seen before, it's the same set up as we have in regression. And so you have this training data and now you choose that you only use some of those data to fit your model. You're not going to use everything, you only use some of it the other part you hide from your model. And then you can use this hidden data to do validation from the point of you of your model. This hidden data is complete by unseen. In other words, we solve our problem of validation."
+ - text: 'Is a else or outside the cob and tree written being of early client rope
+ and you have is for good reasons. On to the ocean in Orange for time. By''s the
+ aggregate we can bed it yet. Why this please pick up on a sort is do and also
+ M Getoi''s nerocos and do rain become you to let so is his brother is made in
+ use and Mjulia''s''s the lay major is aging Masastup coin present sea only of
+ Oosii rooms set to you We do er do we easy this private oliiishs lonthen might
+ be okay. Good afternoon everybody. Welcome to this lecture of Computational Statistics.
+ As you can see, I''m not socially my name is Michael Zelinger. I''m one of the
+ task for this class and you might have already seen me in the first lecture where
+ I made a quick appearance. I''m also going to give the tortillas in the last third
+ of this course. So to give you a little bit about me, I''m a old student here
+ with better Bulman and my research centres on casual inference applied to biomedical
+ disasters, so that could be genomics or that could be hospital data. If any of
+ you is interested in writing a bachelor thesis, a semester paper may be mastathesis
+ about this topic feel for reach out to me. you have my name on models and my email
+ address you can find in the directory I''d Be very happy to talk about it. you
+ do not need to be sure about it, we can just have a chat. So with that said, let''s
+ get on with the lecture. There''s an exciting topic today I''m going to start
+ by sharing some slides with you and later on during the lecture we''ll move to
+ the paper. So bear with me for a few seconds. Well, the projector is starting
+ up. Okay, so let''s get started. Today''s topic is a very important one. It''s
+ about a technique which really forms one of the fundamentals of data science,
+ machine learning, and any sort of modern statistics. It''s called cross validation.
+ I know you really want to understand this topic I Want you to understand this
+ and frankly, nobody''s gonna leave Professor Mineshousen''s class without understanding
+ cross validation. So to set the stage for this, I Want to introduce you to the
+ validation problem in computational statistics. So the problem is the following:
+ You trained a model on available data. You fitted your model, but you know the
+ training data you got could always have been different and some data from the
+ environment. Maybe it''s a random process. You do not really know what it is,
+ but you know that somebody else who gets a different batch of data from the same
+ environment they would get slightly different training data and you do not care
+ that your method performs as well. On this training data. you want to to perform
+ well on other data that you have not seen other data from the same environment.
+ So in other words, the validation problem is you want to quantify the performance
+ of your model on data that you have not seen. So how is this even possible? How
+ could you possibly measure the performance on data that you do not know The solution
+ to? This is the following realization is that given that you have a bunch of data,
+ you were in charge. You get to control how much that your model sees. It works
+ in the following way: You can hide data firms model. Let''s say you have a training
+ data set which is a bunch of doubtless so X eyes are the features those are typically
+ hide and national vector. It''s got more than one dimension for sure. And the
+ why why eyes. Those are the labels for supervised learning. As you''ve seen before,
+ it''s the same set up as we have in regression. And so you have this training
+ data and now you choose that you only use some of those data to fit your model.
+ You''re not going to use everything, you only use some of it the other part you
+ hide from your model. And then you can use this hidden data to do validation from
+ the point of you of your model. This hidden data is complete by unseen. In other
+ words, we solve our problem of validation.'
  example_title: transcribed audio - lecture
  - text: "Transformer-based models have shown to be very useful for many NLP tasks.\
  \ However, a major limitation of transformers-based models is its O(n^2)O(n 2)\
@@ -267,6 +315,39 @@ model-index:
  type: gen_len
  value: 82.2177
  verified: true
+ - task:
+ type: summarization
+ name: Summarization
+ dataset:
+ name: billsum
+ type: billsum
+ config: default
+ split: test
+ metrics:
+ - name: ROUGE-1
+ type: rouge
+ value: 39.6378
+ verified: true
+ - name: ROUGE-2
+ type: rouge
+ value: 13.0017
+ verified: true
+ - name: ROUGE-L
+ type: rouge
+ value: 23.0255
+ verified: true
+ - name: ROUGE-LSUM
+ type: rouge
+ value: 32.9943
+ verified: true
+ - name: loss
+ type: loss
+ value: 1.9428048133850098
+ verified: true
+ - name: gen_len
+ type: gen_len
+ value: 162.3588
+ verified: true
  ---
 
  # long-t5-tglobal-base-16384 + BookSum
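
For readers who want to sanity-check entries like these, the sketch below shows one plausible way to score a model on the billsum test split with the `datasets`, `transformers`, and `evaluate` libraries. It is a minimal illustration, not the autoevaluator's pipeline: the model id `pszemraj/long-t5-tglobal-base-16384-book-summary`, the eight-example slice, and the decoding settings are all assumptions, so the numbers it prints will not match the verified values above.

```python
# pip install transformers datasets evaluate rouge_score
# Sketch only: the model id and decoding settings below are assumptions,
# not the autoevaluator's exact configuration.
from datasets import load_dataset
from transformers import pipeline
import evaluate

# billsum "default" config, test split, matching the new model-index entry;
# a tiny slice keeps the illustration fast (a real run scores the full split)
billsum = load_dataset("billsum", split="test").select(range(8))

summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",  # assumed model id
)

predictions = [
    out["summary_text"]
    for out in summarizer(billsum["text"], truncation=True, max_length=512)
]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=billsum["summary"])
# evaluate reports fractions in [0, 1]; the card reports them scaled by 100
print({k: round(v * 100, 4) for k, v in scores.items()})
```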