---
license: apache-2.0
base_model: google/flan-t5-large
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: result
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# result

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
It achieves the following results on the evaluation set (a sketch of how such metrics are typically computed follows the list):
- Loss: 0.5606
- Squad: {'exact_match': 31.547619047619047, 'f1': 65.97520968920112}
- Bleu: {'bleu': 0.4478359370566898, 'precisions': [0.4970939125114714, 0.45436955820703, 0.43196470987444857, 0.4122681883024251], 'brevity_penalty': 1.0, 'length_ratio': 1.3752629364745477, 'translation_length': 3269, 'reference_length': 2377}
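
The reported dictionaries match the output format of the `squad` and `bleu` metrics from the 🤗 `evaluate` library. A minimal sketch of how numbers like these are typically computed is shown below; the predictions and references are placeholders, since the actual evaluation data is not documented here.

```python
import evaluate

squad = evaluate.load("squad")
bleu = evaluate.load("bleu")

# Placeholder examples only; the real evaluation set is not described in this card.
qa_predictions = [{"id": "0", "prediction_text": "Paris"}]
qa_references = [{"id": "0", "answers": {"text": ["Paris"], "answer_start": [0]}}]
print(squad.compute(predictions=qa_predictions, references=qa_references))
# -> {'exact_match': ..., 'f1': ...}

gen_predictions = ["the cat sat on the mat"]
gen_references = [["the cat sat on the mat"]]
print(bleu.compute(predictions=gen_predictions, references=gen_references))
# -> {'bleu': ..., 'precisions': [...], 'brevity_penalty': ..., 'length_ratio': ...,
#     'translation_length': ..., 'reference_length': ...}
```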

## Model description

More information needed

## Intended uses & limitations

More information needed
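
As a placeholder until usage details are added, the following is a minimal text2text inference sketch for a flan-t5-large fine-tune. The local checkpoint path `./result` is an assumption based on the run name above, not a published Hub repository; substitute the actual path or model id.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# "./result" is assumed from the run name; replace with the real checkpoint path or Hub id.
checkpoint = "./result"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

prompt = "question: What does this model do? context: ..."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```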

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the hedged `Seq2SeqTrainingArguments` sketch after the list):
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
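
Roughly, these settings correspond to the `Seq2SeqTrainingArguments` below. This is a sketch only: the original training script, data preprocessing, and generation settings are not documented here, and the evaluation cadence and `output_dir` are assumptions inferred from the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters; values not listed above are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="result",              # assumed from the run name
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",      # assumed: the table below reports metrics every 100 steps
    eval_steps=100,
    predict_with_generate=True,       # assumed, since BLEU requires generated text
)
```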

### Training results

| Training Loss | Epoch | Step | Validation Loss | Squad                                                         | Bleu                                                                                                                                                                                                                                                         |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| 0.3089        | 0.14  | 100  | 0.4816          | {'exact_match': 35.714285714285715, 'f1': 62.521389029334365} | {'bleu': 0.3454328094223805, 'precisions': [0.4022636892015907, 0.35407932924862945, 0.327451645741432, 0.30527817403708984], 'brevity_penalty': 1.0, 'length_ratio': 1.7017178552837064, 'translation_length': 3269, 'reference_length': 1921}              |
| 0.4283        | 0.27  | 200  | 0.4634          | {'exact_match': 19.047619047619047, 'f1': 58.11077854533359}  | {'bleu': 0.34061189083646615, 'precisions': [0.598654022636892, 0.5469203482747501, 0.5269765863590091, 0.510342368045649], 'brevity_penalty': 0.6252757307345338, 'length_ratio': 0.6804746044962531, 'translation_length': 3269, 'reference_length': 4804} |
| 0.4923        | 0.41  | 300  | 0.4645          | {'exact_match': 33.92857142857143, 'f1': 63.34981920724316}   | {'bleu': 0.39658908018568106, 'precisions': [0.4472315692872438, 0.4040632054176072, 0.38004750593824227, 0.3601997146932953], 'brevity_penalty': 1.0, 'length_ratio': 1.5693710993759002, 'translation_length': 3269, 'reference_length': 2083}             |
| 0.5249        | 0.55  | 400  | 0.5079          | {'exact_match': 33.92857142857143, 'f1': 62.939136694434445}  | {'bleu': 0.31207996448397607, 'precisions': [0.365861119608443, 0.3227990970654628, 0.29555480149304375, 0.2717546362339515], 'brevity_penalty': 1.0, 'length_ratio': 2.19543317662861, 'translation_length': 3269, 'reference_length': 1489}                |
| 0.4899        | 0.68  | 500  | 0.4844          | {'exact_match': 29.166666666666668, 'f1': 63.91710424811924}  | {'bleu': 0.43831704036458624, 'precisions': [0.49036402569593146, 0.44437278297323446, 0.42144553783508654, 0.40192582025677603], 'brevity_penalty': 1.0, 'length_ratio': 1.307077169132347, 'translation_length': 3269, 'reference_length': 2501}           |
| 0.3923        | 0.82  | 600  | 0.4786          | {'exact_match': 29.166666666666668, 'f1': 62.65868074864891}  | {'bleu': 0.4580633444938437, 'precisions': [0.5062710308962985, 0.4630764269590455, 0.44248388191381066, 0.4243937232524964], 'brevity_penalty': 1.0, 'length_ratio': 1.1974358974358974, 'translation_length': 3269, 'reference_length': 2730}              |
| 0.5221        | 0.96  | 700  | 0.4612          | {'exact_match': 30.952380952380953, 'f1': 62.84973727787755}  | {'bleu': 0.5515506497257063, 'precisions': [0.6044661976139493, 0.563044179297001, 0.5446216491347132, 0.5281740370898717], 'brevity_penalty': 0.9860269604247005, 'length_ratio': 0.9861236802413273, 'translation_length': 3269, 'reference_length': 3315} |
| 0.2892        | 1.1   | 800  | 0.4958          | {'exact_match': 30.952380952380953, 'f1': 63.021615827809455} | {'bleu': 0.4763133340972634, 'precisions': [0.5310492505353319, 0.48081264108352145, 0.4584323040380047, 0.4397289586305278], 'brevity_penalty': 1.0, 'length_ratio': 1.1742097701149425, 'translation_length': 3269, 'reference_length': 2784}              |
| 0.3726        | 1.23  | 900  | 0.4814          | {'exact_match': 28.571428571428573, 'f1': 65.07276724002743}  | {'bleu': 0.4631031930079369, 'precisions': [0.514530437442643, 0.4669461464043857, 0.4462164913471327, 0.4290299572039943], 'brevity_penalty': 1.0, 'length_ratio': 1.3003182179793158, 'translation_length': 3269, 'reference_length': 2514}                |
| 0.3296        | 1.37  | 1000 | 0.4888          | {'exact_match': 30.952380952380953, 'f1': 63.75823569325163}  | {'bleu': 0.3620195814265076, 'precisions': [0.4138880391557051, 0.37181554337310546, 0.34543603664743805, 0.3231098430813124], 'brevity_penalty': 1.0, 'length_ratio': 1.6885330578512396, 'translation_length': 3269, 'reference_length': 1936}             |
| 0.5811        | 1.51  | 1100 | 0.5143          | {'exact_match': 29.166666666666668, 'f1': 61.409131873413195} | {'bleu': 0.41718747361604114, 'precisions': [0.465891710003059, 0.4240567558851983, 0.4014251781472684, 0.3819543509272468], 'brevity_penalty': 1.0, 'length_ratio': 1.4541814946619218, 'translation_length': 3269, 'reference_length': 2248}               |
| 0.3257        | 1.64  | 1200 | 0.5088          | {'exact_match': 32.142857142857146, 'f1': 62.3134623709586}   | {'bleu': 0.3381083910829391, 'precisions': [0.39186295503211993, 0.3473073202192841, 0.32168306752629794, 0.2985021398002853], 'brevity_penalty': 1.0, 'length_ratio': 1.8961716937354989, 'translation_length': 3269, 'reference_length': 1724}             |
| 0.3282        | 1.78  | 1300 | 0.4795          | {'exact_match': 35.714285714285715, 'f1': 64.30776389272819}  | {'bleu': 0.37087590095858125, 'precisions': [0.4227592535943714, 0.38052241212512095, 0.3545978961655921, 0.33166904422253923], 'brevity_penalty': 1.0, 'length_ratio': 1.69730010384216, 'translation_length': 3269, 'reference_length': 1926}              |
| 0.3582        | 1.92  | 1400 | 0.5072          | {'exact_match': 31.547619047619047, 'f1': 64.26308637164007}  | {'bleu': 0.3596752944859653, 'precisions': [0.4083817681248088, 0.36988068365043536, 0.34441805225653205, 0.32168330955777463], 'brevity_penalty': 1.0, 'length_ratio': 1.7951674903898958, 'translation_length': 3269, 'reference_length': 1821}            |
| 0.2635        | 2.05  | 1500 | 0.5226          | {'exact_match': 32.73809523809524, 'f1': 63.86332207970659}   | {'bleu': 0.3937338044365284, 'precisions': [0.44447843377179563, 0.40148339245404707, 0.37801153715643027, 0.35627674750356636], 'brevity_penalty': 1.0, 'length_ratio': 1.4988537368179735, 'translation_length': 3269, 'reference_length': 2181}           |
| 0.2464        | 2.19  | 1600 | 0.5321          | {'exact_match': 33.333333333333336, 'f1': 67.38930420708819}  | {'bleu': 0.45354685012711804, 'precisions': [0.5013765677577241, 0.4601741373750403, 0.43841194435018666, 0.41833095577746077], 'brevity_penalty': 1.0, 'length_ratio': 1.4225413402959095, 'translation_length': 3269, 'reference_length': 2298}            |
| 0.2393        | 2.33  | 1700 | 0.5099          | {'exact_match': 35.714285714285715, 'f1': 65.16842917580313}  | {'bleu': 0.4114344617968027, 'precisions': [0.4588559192413582, 0.41921960657852303, 0.39667458432304037, 0.3755349500713267], 'brevity_penalty': 1.0, 'length_ratio': 1.5977517106549364, 'translation_length': 3269, 'reference_length': 2046}             |
| 0.2676        | 2.47  | 1800 | 0.5987          | {'exact_match': 29.761904761904763, 'f1': 63.78707251910303}  | {'bleu': 0.3658122932518219, 'precisions': [0.42551238910981953, 0.37697516930022573, 0.34781133355955207, 0.3209700427960057], 'brevity_penalty': 1.0, 'length_ratio': 1.7187171398527865, 'translation_length': 3269, 'reference_length': 1902}            |
| 0.3071        | 2.6   | 1900 | 0.5240          | {'exact_match': 29.166666666666668, 'f1': 62.92129166698199}  | {'bleu': 0.4832099557140429, 'precisions': [0.5377791373508718, 0.4872621734924218, 0.46521886664404477, 0.4472182596291013], 'brevity_penalty': 1.0, 'length_ratio': 1.0732107682206171, 'translation_length': 3269, 'reference_length': 3046}              |
| 0.2839        | 2.74  | 2000 | 0.5110          | {'exact_match': 35.714285714285715, 'f1': 65.45284344390186}  | {'bleu': 0.43592975914902093, 'precisions': [0.4857754665035179, 0.4414704933892293, 0.4200882253138785, 0.4008559201141227], 'brevity_penalty': 1.0, 'length_ratio': 1.3502684840974803, 'translation_length': 3269, 'reference_length': 2421}              |
| 0.3259        | 2.88  | 2100 | 0.5020          | {'exact_match': 30.357142857142858, 'f1': 66.21243212925624}  | {'bleu': 0.5131744912223959, 'precisions': [0.5646986846130315, 0.5169300225733634, 0.496776382762131, 0.4782453637660485], 'brevity_penalty': 1.0, 'length_ratio': 1.091121495327103, 'translation_length': 3269, 'reference_length': 2996}                 |
| 0.3272        | 3.01  | 2200 | 0.5239          | {'exact_match': 30.952380952380953, 'f1': 67.09846522112723}  | {'bleu': 0.5405154750897113, 'precisions': [0.5830529213826858, 0.5446630119316349, 0.5262979300984052, 0.5106990014265336], 'brevity_penalty': 1.0, 'length_ratio': 1.1176068376068375, 'translation_length': 3269, 'reference_length': 2925}               |
| 0.1856        | 3.15  | 2300 | 0.5524          | {'exact_match': 33.92857142857143, 'f1': 68.43656293435919}   | {'bleu': 0.49285332133310633, 'precisions': [0.5396145610278372, 0.4982263785875524, 0.47777400746521886, 0.4593437945791726], 'brevity_penalty': 1.0, 'length_ratio': 1.2987683750496624, 'translation_length': 3269, 'reference_length': 2517}             |
| 0.233         | 3.29  | 2400 | 0.5277          | {'exact_match': 30.952380952380953, 'f1': 66.14618600499504}  | {'bleu': 0.47457637309615686, 'precisions': [0.5221780360966657, 0.47887778136085135, 0.4594502884289108, 0.44151212553495006], 'brevity_penalty': 1.0, 'length_ratio': 1.2854895792371215, 'translation_length': 3269, 'reference_length': 2543}            |
| 0.2027        | 3.42  | 2500 | 0.5362          | {'exact_match': 30.952380952380953, 'f1': 67.73189400443187}  | {'bleu': 0.4828086230052905, 'precisions': [0.5273784031814011, 0.4888745565946469, 0.46827281981676283, 0.4500713266761769], 'brevity_penalty': 1.0, 'length_ratio': 1.3672103722291928, 'translation_length': 3269, 'reference_length': 2391}              |
| 0.1462        | 3.56  | 2600 | 0.5681          | {'exact_match': 34.523809523809526, 'f1': 66.90869030594244}  | {'bleu': 0.4470358204326154, 'precisions': [0.49434077699602325, 0.4524346984843599, 0.4316253817441466, 0.4136947218259629], 'brevity_penalty': 1.0, 'length_ratio': 1.4535349044019563, 'translation_length': 3269, 'reference_length': 2249}              |
| 0.2218        | 3.7   | 2700 | 0.5582          | {'exact_match': 30.952380952380953, 'f1': 64.88762964740765}  | {'bleu': 0.3810698909088452, 'precisions': [0.4353013153869685, 0.38987423411802646, 0.36443841194435017, 0.340941512125535], 'brevity_penalty': 1.0, 'length_ratio': 1.5588936576061039, 'translation_length': 3269, 'reference_length': 2097}              |
| 0.2644        | 3.84  | 2800 | 0.5324          | {'exact_match': 33.92857142857143, 'f1': 66.27147689733899}   | {'bleu': 0.46985264711666236, 'precisions': [0.5234016518813093, 0.4746855852950661, 0.4523243976925687, 0.43366619115549215], 'brevity_penalty': 1.0, 'length_ratio': 1.2926057730328193, 'translation_length': 3269, 'reference_length': 2529}             |
| 0.1883        | 3.97  | 2900 | 0.5218          | {'exact_match': 31.547619047619047, 'f1': 65.86132294563414}  | {'bleu': 0.4691457249654658, 'precisions': [0.5151422453349648, 0.4756530151564012, 0.4540210383440787, 0.4354493580599144], 'brevity_penalty': 1.0, 'length_ratio': 1.3165525573902537, 'translation_length': 3269, 'reference_length': 2483}               |
| 0.0983        | 4.11  | 3000 | 0.5570          | {'exact_match': 34.523809523809526, 'f1': 66.88532922899014}  | {'bleu': 0.4637264731606073, 'precisions': [0.5074946466809421, 0.4708158658497259, 0.4496097726501527, 0.4304564907275321], 'brevity_penalty': 1.0, 'length_ratio': 1.3834109183241643, 'translation_length': 3269, 'reference_length': 2363}               |
| 0.1298        | 4.25  | 3100 | 0.5555          | {'exact_match': 31.547619047619047, 'f1': 65.9162762624782}   | {'bleu': 0.4642779990799932, 'precisions': [0.5126950137656776, 0.47049338922928086, 0.44859178825924667, 0.42938659058487877], 'brevity_penalty': 1.0, 'length_ratio': 1.3039489429597129, 'translation_length': 3269, 'reference_length': 2507}            |
| 0.1631        | 4.38  | 3200 | 0.5684          | {'exact_match': 32.73809523809524, 'f1': 67.06642684079529}   | {'bleu': 0.5088183152910757, 'precisions': [0.5512389109819517, 0.5146726862302483, 0.494740413980319, 0.4775320970042796], 'brevity_penalty': 1.0, 'length_ratio': 1.2799530148786218, 'translation_length': 3269, 'reference_length': 2554}                |
| 0.1247        | 4.52  | 3300 | 0.5779          | {'exact_match': 32.142857142857146, 'f1': 66.7145922073288}   | {'bleu': 0.4891310418038067, 'precisions': [0.5359437136739064, 0.4946791357626572, 0.47370206990159486, 0.4557774607703281], 'brevity_penalty': 1.0, 'length_ratio': 1.2592449922958397, 'translation_length': 3269, 'reference_length': 2596}              |
| 0.1375        | 4.66  | 3400 | 0.5710          | {'exact_match': 32.73809523809524, 'f1': 67.12997909883624}   | {'bleu': 0.4788691854085839, 'precisions': [0.5240134597736311, 0.4853273137697517, 0.46420088225313877, 0.445435092724679], 'brevity_penalty': 1.0, 'length_ratio': 1.3485973597359735, 'translation_length': 3269, 'reference_length': 2424}               |
| 0.2406        | 4.79  | 3500 | 0.5565          | {'exact_match': 32.73809523809524, 'f1': 67.80127256596187}   | {'bleu': 0.4744984435764912, 'precisions': [0.5206485163658611, 0.48113511770396644, 0.4594502884289108, 0.4404422253922967], 'brevity_penalty': 1.0, 'length_ratio': 1.4145391605365643, 'translation_length': 3269, 'reference_length': 2311}              |
| 0.1866        | 4.93  | 3600 | 0.5606          | {'exact_match': 31.547619047619047, 'f1': 65.97520968920112}  | {'bleu': 0.4478359370566898, 'precisions': [0.4970939125114714, 0.45436955820703, 0.43196470987444857, 0.4122681883024251], 'brevity_penalty': 1.0, 'length_ratio': 1.3752629364745477, 'translation_length': 3269, 'reference_length': 2377}                |


### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1