---
language: en
tags:
- summarization
license: mit
model-index:
- name: SamuelAllen123/t5-efficient-large-nl36_fine_tune_sum_V2
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 50.5049
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 25.6469
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 41.7544
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 46.2055
      verified: true
    - name: loss
      type: loss
      value: 1.5158178806304932
      verified: true
    - name: gen_len
      type: gen_len
      value: 24.0342
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: cnn_dailymail
      type: cnn_dailymail
      config: 3.0.0
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 34.4055
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 14.127
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 24.3353
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 31.6582
      verified: true
    - name: loss
      type: loss
      value: 2.4456119537353516
      verified: true
    - name: gen_len
      type: gen_len
      value: 45.928
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: train
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 54.933
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 31.7965
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 47.0057
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 51.2027
      verified: true
    - name: loss
      type: loss
      value: 1.130684494972229
      verified: true
    - name: gen_len
      type: gen_len
      value: 23.7989
      verified: true
---
Fine-tuned on the SAMSum train split.
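
For reference, a minimal inference sketch using the `transformers` pipeline API (the dialogue and the generation settings such as `max_length` are illustrative placeholders, not the evaluation settings used for the reported scores):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint for summarization.
summarizer = pipeline(
    "summarization",
    model="SamuelAllen123/t5-efficient-large-nl36_fine_tune_sum_V2",
)

# Example SAMSum-style dialogue (made up for illustration).
dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes, 12:30 at the usual place.\n"
    "Anna: Perfect, see you then!"
)

# max_length / min_length are illustrative values.
print(summarizer(dialogue, max_length=60, min_length=5)[0]["summary_text"])
```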

Training parameters:

```python
import torch
from transformers import get_scheduler

# `model` is the T5-Efficient-LARGE-NL36 checkpoint being fine-tuned
# (e.g. loaded with AutoModelForSeq2SeqLM.from_pretrained(...)).

# Split parameters into decay / no-decay groups; bias and layer-norm weights
# are conventionally excluded from weight decay. Both groups were trained
# with weight_decay=0.0 here.
no_decay = ["bias", "LayerNorm.weight", "layer_norm.weight"]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
    {
        "params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]

lr = 0.00005
optimizer = torch.optim.RAdam(optimizer_grouped_parameters, lr=lr)

# Linear learning-rate schedule with no warmup.
lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=50005,
)
```

Training ran for only 10K steps with a batch size of 10.
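
The full training loop is not included in this card; below is a minimal sketch of how the optimizer and scheduler above would typically be stepped, assuming a PyTorch `DataLoader` named `train_dataloader` that yields tokenized SAMSum batches (the name and preprocessing are placeholders, not the exact script used):

```python
model.train()
for step, batch in enumerate(train_dataloader):
    # Forward pass: T5 computes the cross-entropy loss when `labels` are given.
    outputs = model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        labels=batch["labels"],
    )
    loss = outputs.loss

    # Backward pass and parameter update.
    loss.backward()
    optimizer.step()
    lr_scheduler.step()
    optimizer.zero_grad()

    if step >= 10_000:  # the card reports roughly 10K update steps
        break
```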

If you want more info, feel free to message me or email me at:
[email protected]