Commit bd1c868
Parent(s): 1d9f226
Add evaluation results on the bazzhangz--sumdataset config and train split of bazzhangz/sumdataset (#3)
- Add evaluation results on the bazzhangz--sumdataset config and train split of bazzhangz/sumdataset (fabbffcffff85f560eebdf92b190ac6b208c7d15)
Co-authored-by: Evaluation Bot <[email protected]>
README.md
CHANGED
@@ -10,19 +10,190 @@ datasets:
|
|
10 |
metrics:
|
11 |
- rouge
|
12 |
widget:
|
13 |
-
- text:
|
14 |
-
|
15 |
-
|
16 |
-
|
17 |
-
|
18 |
model-index:
|
19 |
- name: MEETING_SUMMARY
|
20 |
results:
|
21 |
-
- task:
|
22 |
name: Abstractive Text Summarization
|
23 |
type: abstractive-text-summarization
|
24 |
dataset:
|
25 |
-
name:
|
26 |
type: samsum
|
27 |
metrics:
|
28 |
- name: Validation ROGUE-1
|
@@ -55,13 +226,46 @@ model-index:
|
|
55 |
- name: Test ROGUE-Lsum
|
56 |
type: gen-length
|
57 |
value: 29.9951
|
58 |
- name: MEETING_SUMMARY
|
59 |
results:
|
60 |
-
- task:
|
61 |
name: Abstractive Text Summarization
|
62 |
type: abstractive-text-summarization
|
63 |
dataset:
|
64 |
-
name:
|
65 |
type: xsum
|
66 |
metrics:
|
67 |
- name: Validation ROGUE-1
|
@@ -96,11 +300,11 @@ model-index:
|
|
96 |
value: 31.9933
|
97 |
- name: MEETING_SUMMARY
|
98 |
results:
|
99 |
-
- task:
|
100 |
name: Abstractive Text Summarization
|
101 |
type: abstractive-text-summarization
|
102 |
dataset:
|
103 |
-
name:
|
104 |
type: dialogsum
|
105 |
metrics:
|
106 |
- name: Validation ROGUE-1
|
|
|
10 |
metrics:
|
11 |
- rouge
|
12 |
widget:
|
13 |
+
- text: 'Hi, I''m David and I''m supposed to be an industrial designer. Um, I just
|
14 |
+
got the project announcement about what the project is. Designing a remote control.
|
15 |
+
That''s about it, didn''t get anything else. Did you get the same thing? Cool.
|
16 |
+
There''s too much gear. Okay. Can''t draw. Um. Yeah. Um, well anyway, I don''t
|
17 |
+
know, it''s just the first animal I can think off the top of my head. Um. Yes.
|
18 |
+
Big reason is ''cause I''m allergic to most animals. Allergic to animal fur, so
|
19 |
+
um fish was a natural choice. Um, yeah, and I kind of like whales. They come in
|
20 |
+
and go eat everything in sight. And they''re quite harmless and mild and interesting.
|
21 |
+
Tail''s a bit big, I think. It''s an after dinner dog then. Hmm. It does make
|
22 |
+
sense from maybe the design point of view ''cause you have more complicated characters
|
23 |
+
like European languages, then you need more buttons. So, possibly. Hmm. Yeah.
|
24 |
+
And you keep losing them. Finding them is really a pain, you know. I mean it''s
|
25 |
+
usually quite small, or when you want it right, it slipped behind the couch or
|
26 |
+
it''s kicked under the table. You know. Yep. Mm-hmm. I think one factor would
|
27 |
+
be production cost. Because there''s a cap there, so um depends on how much you
|
28 |
+
can cram into that price. Um. I think that that''s the main factor. Cool.
|
29 |
+
|
30 |
+
Okay. Right. Um well this is the kick-off meeting for our our project. Um and
|
31 |
+
um this is just what we''re gonna be doing over the next twenty five minutes.
|
32 |
+
Um so first of all, just to kind of make sure that we all know each other, I''m
|
33 |
+
Laura and I''m the project manager. Do you want to introduce yourself again? Okay.
|
34 |
+
Great. Okay. Um so we''re designing a new remote control and um Oh I have to record
|
35 |
+
who''s here actually. So that''s David, Andrew and Craig, isn''t it? And you all
|
36 |
+
arrived on time. Um yeah so des uh design a new remote control. Um, as you can
|
37 |
+
see it''s supposed to be original, trendy and user friendly. Um so that''s kind
|
38 |
+
of our our brief, as it were. Um and so there are three different stages to the
|
39 |
+
design. Um I''m not really sure what what you guys have already received um in
|
40 |
+
your emails. What did you get? Mm-hmm. Is that what everybody got? Okay. Um. So
|
41 |
+
we''re gonna have like individual work and then a meeting about it. And repeat
|
42 |
+
that process three times. Um and at this point we get try out the whiteboard over
|
43 |
+
there. Um. So uh you get to draw your favourite animal and sum up your favourite
|
44 |
+
characteristics of it. So who would like to go first? Very good. Mm-hmm. Yeah.
|
45 |
+
Yeah. Right. Lovely. Right. You can take as long over this as you like, because
|
46 |
+
we haven''t got an awful lot to discuss. Ok oh we do we do. Don''t feel like you''re
|
47 |
+
in a rush, anyway. Ach why not We might have to get you up again then. I don''t
|
48 |
+
know what mine is. I''m gonna have to think on the spot now. Is that a whale?
|
49 |
+
Ah. Okay. God, I still don''t know what I''m gonna write about. Um. I was gonna
|
50 |
+
choose a dog as well. But I''ll just draw a different kind of dog. M my favourite
|
51 |
+
animal is my own dog at home. Um That doesn''t really look like him, actually.
|
52 |
+
He looks more like a pig, actually. Ah well. Do you? Oh that''s very good of you.
|
53 |
+
Uh. Um he''s a mixture of uh various things. Um and what do I like about him,
|
54 |
+
um That''s just to suggest that his tail wags. Um he''s very friendly and cheery
|
55 |
+
and always pleased to see you, and very kind of affectionate and um uh and he''s
|
56 |
+
quite quite wee as well so you know he can doesn''t take up too much space. Um
|
57 |
+
and uh And he does a funny thing where he chases his tail as well, which is quite
|
58 |
+
amusing, so It is. I think it is. He only does it after he''s had his dinner and
|
59 |
+
um he''ll just all of a sudden just get up and start chasing his tail ''round
|
60 |
+
the living room. Yeah, so uh Yeah, maybe. Maybe. Right, um where did you find
|
61 |
+
this? Just down here? Yeah. Okay. Um what are we doing next? Uh um. Okay, uh we
|
62 |
+
now need to discuss the project finance. Um so according to the brief um we''re
|
63 |
+
gonna be selling this remote control for twenty five Euro, um and we''re aiming
|
64 |
+
to make fifty million Euro. Um so we''re gonna be selling this on an international
|
65 |
+
scale. And uh we don''t want it to cost any more than uh twelve fifty Euros, so
|
66 |
+
fifty percent of the selling price. Sure. All together. Um I dunno. I imagine
|
67 |
+
That''s a good question. I imagine it probably is our sale actually because it''s
|
68 |
+
probably up to the the um the retailer to uh sell it for whatever price they want.
|
69 |
+
Um. But I I don''t know, I mean do you think the fact that it''s going to be sold
|
70 |
+
internationally will have a bearing on how we design it at all? Think it will?
|
71 |
+
Um. Hmm. Oh yeah, regions and stuff, yeah. Yeah. Okay. Yeah. Well for a remote
|
72 |
+
control, do you think that will be I suppose it''s depends on how complicated
|
73 |
+
our remote control is. Yeah, yeah. Okay. What, just like in terms of like the
|
74 |
+
wealth of the country? Like how much money people have to spend on things like?
|
75 |
+
Aye, I see what you mean, yeah. Marketing. Good marketing thoughts. Oh gosh, I
|
76 |
+
should be writing all this down. Um. Mm. Yeah. Yeah, yeah. Like how much does,
|
77 |
+
you know, a remote control cost. Well twenty five Euro, I mean that''s um that''s
|
78 |
+
about like eighteen pounds or something, isn''t it? Or no, is it as much as that?
|
79 |
+
Sixteen seventeen eighteen pounds. Um, I dunno, I''ve never bought a remote control,
|
80 |
+
so I don''t know how how good a remote control that would get you. Um. But yeah,
|
81 |
+
I suppose it has to look kind of cool and gimmicky. Um right, okay. Let me just
|
82 |
+
scoot on ahead here. Okay. Um well d Does anybody have anything to add to uh to
|
83 |
+
the finance issue at all? Thin No, actually. That would be useful, though, wouldn''t
|
84 |
+
it, if you knew like what your money would get you now. Mm-hmm. Yeah, yeah. Oh.
|
85 |
+
Five minutes to end of meeting. Oh, okay. We''re a bit behind. Yeah. Right, so
|
86 |
+
do you think that should be like a main design aim of our remote control d you
|
87 |
+
know, do your your satellite and your regular telly and your V_C_R_ and everything?
|
88 |
+
Mm-hmm. Yeah. Or even like, you know, notes about um what you wanna watch. Like
|
89 |
+
you might put in there oh I want to watch such and such and look a Oh that''s
|
90 |
+
a good idea. So extra functionalities. Mm-hmm. Hmm. Um okay, uh I''d wel we''re
|
91 |
+
gonna have to wrap up pretty quickly in the next couple of minutes. Um I''ll just
|
92 |
+
check we''ve nothing else. Okay. Um so anything else anybody wants to add about
|
93 |
+
what they don''t like about remote controls they''ve used, what they would really
|
94 |
+
like to be part of this new one at all? You keep losing them. Okay. Yeah. W You
|
95 |
+
get those ones where you can, if you like, whistle or make a really high pitched
|
96 |
+
noise they beep. There I mean is that something we''d want to include, do you
|
97 |
+
think? Dunno. Okay maybe. My goodness. Still feels quite primitive. Maybe like
|
98 |
+
a touch screen or something? Okay. Uh-huh, okay. Well I guess that''s up to our
|
99 |
+
industrial designer. It looks better. Yeah. Okay. Okay. Right, well um so just
|
100 |
+
to wrap up, the next meeting''s gonna be in thirty minutes. So that''s about um
|
101 |
+
about ten to twelve by my watch. Um so inbetween now and then, um as the industrial
|
102 |
+
designer, you''re gonna be working on you know the actual working design of it
|
103 |
+
so y you know what you''re doing there. Um for user interface, technical functions,
|
104 |
+
I guess that''s you know like what we''ve been talking about, what it''ll actually
|
105 |
+
do. Um and uh marketing executive, you''ll be just thinking about what it actually
|
106 |
+
what, you know, what requirements it has to has to fulfil and you''ll all get
|
107 |
+
instructions emailed to you, I guess. Um. Yeah, so it''s th the functional design
|
108 |
+
stage is next, I guess. And uh and that''s the end of the meeting. So I got that
|
109 |
+
little message a lot sooner than I thought I would, so Mm-hmm. Uh-huh, yeah. Th
|
110 |
+
Okay, well just very quickly ''cause this we''re supposed to finish now. Um I
|
111 |
+
guess that''s up to us, I mean you probably want some kind of unique selling point
|
112 |
+
of it, so um, you know Yeah. Mm-hmm. Yeah. Okay. Right, okay, we''ll that''s that''s
|
113 |
+
the end of the meeting, then. Um. So, uh thank you all for coming.
|
114 |
+
|
115 |
+
Um I''m Craig and I''m User Interface. Yeah. Well, my favourite animal would be
|
116 |
+
a monkey. Then they''re small cute and furry, and uh when planet of the apes becomes
|
117 |
+
real, I''m gonna be up there with them. Yeah. I know um My parents went out and
|
118 |
+
bought um remote controls because um they got fed up of having four or five different
|
119 |
+
remote controls for each things the house. So um for them it was just how many
|
120 |
+
devices control. Uh.
|
121 |
+
|
122 |
+
Mm-hmm. Great. And I''m Andrew and I''m uh our marketing expert. Mm-hmm. Mm-hmm.
|
123 |
+
Yeah, that''s that''s it. Yeah. I will go. That''s fine. Alright. So This one
|
124 |
+
here, right? Okay. Very nice. Alright. My favourite animal is like A beagle. Um
|
125 |
+
charac favourite characteristics of it? Is that right? Uh, right, well basically
|
126 |
+
um high priority for any animal for me is that they be willing to take a lot of
|
127 |
+
physical affection from their family. And, yeah that they have lots of personality
|
128 |
+
and uh be fit and in robust good health. So this is blue. Blue beagle. My family''s
|
129 |
+
beagle. I coulda told you a whole lot more about beagles. Boy, let me tell you.
|
130 |
+
Impressionist. Alright. Mm. Superb sketch, by the way. Yep. I see a dog in there.
|
131 |
+
Yep. Now I see a rooster. What kind is it? Is he aware that th it''s his own cha
|
132 |
+
tail he''s chasing? Hmm. Probably when he was little he got lots of attention
|
133 |
+
for doing it and has forever been conditioned. ''Kay. Um, can we just go over
|
134 |
+
that again? Uh, so bas at twel Alright, yeah. Okay. So cost like production cost
|
135 |
+
is twelve fifty, but selling price is is that wholesale or retail? Like on the
|
136 |
+
shelf. Our sale our sale anyway. Yeah, okay okay. Okay. Mm-hmm. Alright. Yes.
|
137 |
+
Mm-hmm. Mm-hmm. Well right away I''m wondering if there''s um th th uh, like with
|
138 |
+
D_V_D_ players, if there are zones. Um f frequencies or something um as well as
|
139 |
+
uh characters, um different uh keypad styles and s symbols. Um. I don''t know.
|
140 |
+
Yeah. Yeah. Yeah. And then a and then al the other thing international is on top
|
141 |
+
of the price. I''m thinking the price might might appeal to a certain market in
|
142 |
+
one region, whereas in another it''ll be different, so Just a chara just a characteristic
|
143 |
+
of the Just Or just like, basic product podi positioning, the twenty five Euro
|
144 |
+
remote control might be a big hit in London, might not be such a big hit in Greece,
|
145 |
+
who knows, something like that, yeah. Yep. Right away I''m making some kind of
|
146 |
+
assumptions about what what information we''re given here, thinking, ''kay trendy
|
147 |
+
probably means something other than just basic, something other than just standard.
|
148 |
+
Um so I''m wondering right away, is selling twenty five Euros, is that sort of
|
149 |
+
the thi is this gonna to be like the premium product kinda thing or Uh-huh. Mm-hmm.
|
150 |
+
Yep. Yeah, I''d say so, yeah. No. Yeah, yeah. Mm-hmm. Do we have any other background
|
151 |
+
information on like how that compares to other other Yeah. Mm-hmm. Yeah, interesting
|
152 |
+
thing about discussing um production of a remote control for me is that l as you
|
153 |
+
point out, I just don''t think of remote controls as somethin something people
|
154 |
+
consciously assess in their purchasing habits. It''s just like getting shoelaces
|
155 |
+
with shoes or something. It just comes along. Do you know what I mean? Like so
|
156 |
+
sort of like how do you I I mean one one way of looking at it would be, well the
|
157 |
+
people producing television sets, maybe they have to buy remote controls. Or another
|
158 |
+
way is maybe people who have T_V_ sets are really fed up with their remote control
|
159 |
+
and they really want a better one or something. But Right. Right. Okay so Right,
|
160 |
+
so in function one of the priorities might be to combine as many uses I think
|
161 |
+
so. Yeah, yeah. Yeah. Well like um, maybe what we could use is a sort of like
|
162 |
+
a example of a successful other piece technology is palm palm pilots. They''re
|
163 |
+
gone from being just like little sort of scribble boards to cameras, M_P_ three
|
164 |
+
players, telephones, everything, agenda. So, like, I wonder if we might add something
|
165 |
+
new to the to the remote control market, such as the lighting in your house, or
|
166 |
+
um Yeah, yeah. An Yeah. Like, p personally for me, at home I''ve I''ve combined
|
167 |
+
the um the audio video of my television set and my D_V_D_ player and my C_D_ player.
|
168 |
+
So they w all work actually function together but I have different remote controls
|
169 |
+
for each of them. So it''s sort of ironic that that then they''re in there um
|
170 |
+
you know, the sound and everything it''s just one system. But each one''s got
|
171 |
+
its own little part. Mm. Mm. Mm. Mm-hmm. Mm-hmm. Yeah. Yeah. That''s just really
|
172 |
+
good id Yep. Uh, sure. I remember when the first remote control my my family had
|
173 |
+
was on a cable. Actually had a cable between it and the T_V_ and big like buttons
|
174 |
+
that sort of like, like on a blender or something. And um, you know, when I think
|
175 |
+
about what they are now, it''s better, but actually it''s still kind of, I dunno,
|
176 |
+
like a massive junky thing on the table. Maybe we could think about how, could
|
177 |
+
be more, you know, streamlined. S Something like that, yeah. Or whatever would
|
178 |
+
be technologically reasonable. ''Cause it could b it could it could be that f
|
179 |
+
it could be that functionally that doesn''t make it any better, but that just
|
180 |
+
the appeal of of not having You know, these days there''s a r pe things in people''s
|
181 |
+
homes are becoming more and more like chic, you know. Um, nicer materials and
|
182 |
+
might be be worth exploring anyway. Okay. Um. Before we wrap up, just to make
|
183 |
+
sure we''re all on the same page here, um, do we We were given sort of an example
|
184 |
+
of a coffee machine or something, right? Well, um are we at ma right now on the
|
185 |
+
assumption that our television remote control may have features which go beyond
|
186 |
+
the television? Or are we keeping sort of like a a design commitment to television
|
187 |
+
features? I I don''t know. Yep. Yeah, sure. Okay. Okay, yeah. Okay. Okay. Okay.
|
188 |
+
Alright.'
|
189 |
model-index:
|
190 |
- name: MEETING_SUMMARY
|
191 |
results:
|
192 |
+
- task:
|
193 |
name: Abstractive Text Summarization
|
194 |
type: abstractive-text-summarization
|
195 |
dataset:
|
196 |
+
name: samsum
|
197 |
type: samsum
|
198 |
metrics:
|
199 |
- name: Validation ROGUE-1
|
|
|
226 |
- name: Test ROGUE-Lsum
|
227 |
type: gen-length
|
228 |
value: 29.9951
|
229 |
+
- task:
|
230 |
+
type: summarization
|
231 |
+
name: Summarization
|
232 |
+
dataset:
|
233 |
+
name: bazzhangz/sumdataset
|
234 |
+
type: bazzhangz/sumdataset
|
235 |
+
config: bazzhangz--sumdataset
|
236 |
+
split: train
|
237 |
+
metrics:
|
238 |
+
- name: ROUGE-1
|
239 |
+
type: rouge
|
240 |
+
value: 40.5544
|
241 |
+
verified: true
|
242 |
+
- name: ROUGE-2
|
243 |
+
type: rouge
|
244 |
+
value: 17.0751
|
245 |
+
verified: true
|
246 |
+
- name: ROUGE-L
|
247 |
+
type: rouge
|
248 |
+
value: 32.153
|
249 |
+
verified: true
|
250 |
+
- name: ROUGE-LSUM
|
251 |
+
type: rouge
|
252 |
+
value: 36.4277
|
253 |
+
verified: true
|
254 |
+
- name: loss
|
255 |
+
type: loss
|
256 |
+
value: 2.116729736328125
|
257 |
+
verified: true
|
258 |
+
- name: gen_len
|
259 |
+
type: gen_len
|
260 |
+
value: 42.1978
|
261 |
+
verified: true
|
262 |
- name: MEETING_SUMMARY
|
263 |
results:
|
264 |
+
- task:
|
265 |
name: Abstractive Text Summarization
|
266 |
type: abstractive-text-summarization
|
267 |
dataset:
|
268 |
+
name: xsum
|
269 |
type: xsum
|
270 |
metrics:
|
271 |
- name: Validation ROGUE-1
|
|
|
300 |
value: 31.9933
|
301 |
- name: MEETING_SUMMARY
|
302 |
results:
|
303 |
+
- task:
|
304 |
name: Abstractive Text Summarization
|
305 |
type: abstractive-text-summarization
|
306 |
dataset:
|
307 |
+
name: dialogsum
|
308 |
type: dialogsum
|
309 |
metrics:
|
310 |
- name: Validation ROGUE-1
|
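The evaluation entry this commit adds for bazzhangz/sumdataset can be inspected programmatically once the model card metadata is loaded. The following is a minimal sketch, not the Evaluation Bot's own code: the dictionary is hand-copied from the added lines above, and the helper name `metrics_by_name` is hypothetical. In practice the entry would come from parsing the README.md YAML front matter rather than from a literal.

```python
# Hand-copied from the lines this commit adds to README.md (an assumption
# for illustration; a real script would parse the YAML front matter instead).
added_result = {
    "task": {"type": "summarization", "name": "Summarization"},
    "dataset": {
        "name": "bazzhangz/sumdataset",
        "type": "bazzhangz/sumdataset",
        "config": "bazzhangz--sumdataset",
        "split": "train",
    },
    "metrics": [
        {"name": "ROUGE-1", "type": "rouge", "value": 40.5544, "verified": True},
        {"name": "ROUGE-2", "type": "rouge", "value": 17.0751, "verified": True},
        {"name": "ROUGE-L", "type": "rouge", "value": 32.153, "verified": True},
        {"name": "ROUGE-LSUM", "type": "rouge", "value": 36.4277, "verified": True},
        {"name": "loss", "type": "loss", "value": 2.116729736328125, "verified": True},
        {"name": "gen_len", "type": "gen_len", "value": 42.1978, "verified": True},
    ],
}


def metrics_by_name(result):
    """Hypothetical helper: map each metric name to its reported value."""
    return {metric["name"]: metric["value"] for metric in result["metrics"]}


if __name__ == "__main__":
    # Print the dataset identifier followed by one line per metric.
    dataset = added_result["dataset"]
    print(f"{dataset['name']} ({dataset['config']}, split={dataset['split']})")
    for name, value in metrics_by_name(added_result).items():
        print(f"  {name}: {value}")
```

Keying the metrics by name makes it straightforward to compare this entry against the samsum, xsum, and dialogsum entries already present in the same model-index.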