Initial Commit

Files changed (4)

README.md +32 -18
eval_results_cardiff.json +1 -1
pytorch_model.bin +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -18,9 +18,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.6543
-- Accuracy: 0.5131
-- F1: 0.5145
 ## Model description
@@ -40,8 +40,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 64
-- eval_batch_size: 128
 - seed: 66
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -51,19 +51,33 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| No log        | 2.17  | 250  | 1.4134          | 0.5069   | 0.5083 |
-| 0.5524        | 4.35  | 500  | 1.9853          | 0.5208   | 0.5230 |
-| 0.5524        | 6.52  | 750  | 2.5990          | 0.4853   | 0.4797 |
-| 0.1315        | 8.7   | 1000 | 2.8603          | 0.4961   | 0.4954 |
-| 0.1315        | 10.87 | 1250 | 3.1408          | 0.5093   | 0.5099 |
-| 0.0497        | 13.04 | 1500 | 3.3859          | 0.5177   | 0.5190 |
-| 0.0497        | 15.22 | 1750 | 3.9204          | 0.5039   | 0.5044 |
-| 0.0219        | 17.39 | 2000 | 4.0747          | 0.5139   | 0.5160 |
-| 0.0219        | 19.57 | 2250 | 4.3170          | 0.5139   | 0.5156 |
-| 0.0133        | 21.74 | 2500 | 4.5924          | 0.5023   | 0.5020 |
-| 0.0133        | 23.91 | 2750 | 4.6042          | 0.5100   | 0.5114 |
-| 0.0046        | 26.09 | 3000 | 4.5407          | 0.5147   | 0.5163 |
-| 0.0046        | 28.26 | 3250 | 4.6543          | 0.5131   | 0.5145 |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.7992
+- Accuracy: 0.5154
+- F1: 0.5146
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 64
 - seed: 66
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
+| No log        | 1.09  | 250  | 1.2801          | 0.5069   | 0.5029 |
+| 0.7278        | 2.17  | 500  | 1.6260          | 0.5077   | 0.5080 |
+| 0.7278        | 3.26  | 750  | 1.6500          | 0.5193   | 0.5209 |
+| 0.3512        | 4.35  | 1000 | 2.1813          | 0.5123   | 0.5144 |
+| 0.3512        | 5.43  | 1250 | 2.5133          | 0.5154   | 0.5167 |
+| 0.1838        | 6.52  | 1500 | 2.6502          | 0.5093   | 0.5093 |
+| 0.1838        | 7.61  | 1750 | 3.0408          | 0.5015   | 0.5021 |
+| 0.118         | 8.7   | 2000 | 3.3486          | 0.4877   | 0.4822 |
+| 0.118         | 9.78  | 2250 | 3.5117          | 0.4923   | 0.4906 |
+| 0.072         | 10.87 | 2500 | 3.5966          | 0.5046   | 0.5027 |
+| 0.072         | 11.96 | 2750 | 3.3823          | 0.5100   | 0.5121 |
+| 0.0545        | 13.04 | 3000 | 3.7627          | 0.5085   | 0.5053 |
+| 0.0545        | 14.13 | 3250 | 3.9342          | 0.5108   | 0.5124 |
+| 0.0336        | 15.22 | 3500 | 4.2215          | 0.5093   | 0.5061 |
+| 0.0336        | 16.3  | 3750 | 4.2219          | 0.5046   | 0.5021 |
+| 0.0272        | 17.39 | 4000 | 4.0061          | 0.5208   | 0.5227 |
+| 0.0272        | 18.48 | 4250 | 4.3214          | 0.5116   | 0.5074 |
+| 0.0198        | 19.57 | 4500 | 4.5333          | 0.5093   | 0.5075 |
+| 0.0198        | 20.65 | 4750 | 4.3535          | 0.5247   | 0.5256 |
+| 0.0161        | 21.74 | 5000 | 4.5169          | 0.5239   | 0.5238 |
+| 0.0161        | 22.83 | 5250 | 4.4982          | 0.5285   | 0.5298 |
+| 0.012         | 23.91 | 5500 | 4.5591          | 0.5170   | 0.5186 |
+| 0.012         | 25.0  | 5750 | 4.7615          | 0.5085   | 0.5069 |
+| 0.0066        | 26.09 | 6000 | 4.8457          | 0.5100   | 0.5079 |
+| 0.0066        | 27.17 | 6250 | 4.7872          | 0.5131   | 0.5118 |
+| 0.0069        | 28.26 | 6500 | 4.6257          | 0.5301   | 0.5303 |
+| 0.0069        | 29.35 | 6750 | 4.7992          | 0.5154   | 0.5146 |
 ### Framework versions
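
The Epoch and Step columns in the two tables are tied together by the batch size. The following is a minimal sketch of that arithmetic, assuming a single device, no gradient accumulation, and that the last partial batch is kept; none of these settings are stated in the diff, and the helper name `steps_per_epoch` is hypothetical.

```python
def steps_per_epoch(step: int, epoch: float) -> int:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch)


# Final logged rows of the removed (batch size 64) and added (batch size 32) tables.
old_steps = steps_per_epoch(3250, 28.26)
new_steps = steps_per_epoch(6750, 29.35)

# Assumption: one device and no gradient accumulation, so every step consumes
# one batch and steps * batch_size is an upper bound on the training-set size.
old_upper_bound = old_steps * 64
new_upper_bound = new_steps * 32

print(old_steps, new_steps, old_upper_bound, new_upper_bound)
```

Under these assumptions, halving the batch size doubles the steps per epoch, and both runs imply the same upper bound on the number of training examples. This is consistent with the only logged hyperparameter change being the batch size.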

eval_results_cardiff.json CHANGED

@@ -1 +1 @@

- {"arabic": {"f1": 0.602933121281396, "accuracy": 0.6011494252873564, "confusion_matrix": [[149, 105, 36], [46, 204, 40], [18, 102, 170]]}, "english": {"f1": 0.7045149108105302, "accuracy": 0.7103448275862069, "confusion_matrix": [[250, 34, 6], [91, 151, 48], [21, 52, 217]]}, "french": {"f1": 0.5256412382793743, "accuracy": 0.5494252873563218, "confusion_matrix": [[152, 125, 13], [21, 257, 12], [21, 200, 69]]}, "german": {"f1": 0.7476717511679669, "accuracy": 0.7494252873563219, "confusion_matrix": [[231, 32, 27], [55, 188, 47], [25, 32, 233]]}, "hindi": {"f1": 0.5333296333296333, "accuracy": 0.532183908045977, "confusion_matrix": [[147, 93, 50], [61, 150, 79], [41, 83, 166]]}, "italian": {"f1": 0.7290103738322573, "accuracy": 0.728735632183908, "confusion_matrix": [[193, 63, 34], [23, 239, 28], [21, 67, 202]]}, "portuguese": {"f1": 0.664227795592618, "accuracy": 0.667816091954023, "confusion_matrix": [[186, 54, 50], [50, 161, 79], [24, 32, 234]]}, "spanish": {"f1": 0.6453352385900618, "accuracy": 0.6494252873563219, "confusion_matrix": [[223, 47, 20], [87, 142, 61], [26, 64, 200]]}, "all": {"f1": 0.6441442370542694, "accuracy": 0.6422413793103449, "confusion_matrix": [[1518, 558, 244], [439, 1439, 442], [194, 613, 1513]]}}

+ {"arabic": {"f1": 0.5940991842458893, "accuracy": 0.593103448275862, "confusion_matrix": [[146, 113, 31], [48, 215, 27], [20, 115, 155]]}, "english": {"f1": 0.6767487000243078, "accuracy": 0.6793103448275862, "confusion_matrix": [[235, 47, 8], [85, 156, 49], [27, 63, 200]]}, "french": {"f1": 0.5091797893031121, "accuracy": 0.539080459770115, "confusion_matrix": [[147, 134, 9], [23, 259, 8], [33, 194, 63]]}, "german": {"f1": 0.7221885493493773, "accuracy": 0.7229885057471265, "confusion_matrix": [[222, 40, 28], [62, 187, 41], [36, 34, 220]]}, "hindi": {"f1": 0.5148790318312105, "accuracy": 0.5172413793103449, "confusion_matrix": [[155, 64, 71], [88, 123, 79], [65, 53, 172]]}, "italian": {"f1": 0.7147248790960297, "accuracy": 0.7160919540229885, "confusion_matrix": [[183, 63, 44], [19, 242, 29], [28, 64, 198]]}, "portuguese": {"f1": 0.6633915248733473, "accuracy": 0.6632183908045977, "confusion_matrix": [[200, 59, 31], [60, 177, 53], [33, 57, 200]]}, "spanish": {"f1": 0.6573314978575006, "accuracy": 0.6609195402298851, "confusion_matrix": [[218, 46, 26], [80, 147, 63], [18, 62, 210]]}, "all": {"f1": 0.6383252563181864, "accuracy": 0.6364942528735632, "confusion_matrix": [[1526, 562, 232], [471, 1494, 355], [245, 665, 1410]]}}
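
The `accuracy` and `f1` fields can be recomputed from each `confusion_matrix`. The sketch below assumes `f1` is the macro average over the three classes; the file does not state the averaging mode, but this assumption reproduces the Arabic entry of the added line. The function name `accuracy_and_macro_f1` is hypothetical.

```python
def accuracy_and_macro_f1(cm):
    """Return (accuracy, macro-F1) for a square confusion matrix given as nested lists."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    per_class_f1 = []
    for i in range(n):
        row_sum = sum(cm[i])                        # all cells in row i
        col_sum = sum(cm[r][i] for r in range(n))   # all cells in column i
        denom = row_sum + col_sum
        # Per-class F1 = 2*TP / (row_sum + col_sum); it does not depend on
        # whether rows hold the true labels or the predictions.
        per_class_f1.append(2 * cm[i][i] / denom if denom else 0.0)
    return correct / total, sum(per_class_f1) / n


# Arabic matrix from the added ("+") line of eval_results_cardiff.json.
arabic = [[146, 113, 31], [48, 215, 27], [20, 115, 155]]
acc, f1 = accuracy_and_macro_f1(arabic)
print(f"accuracy={acc:.6f} macro_f1={f1:.6f}")
```

Each per-language matrix sums to 870 examples (290 per row), so a drop of 0.01 in accuracy corresponds to roughly nine more misclassified examples in that language.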

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4a03fb5668e091e8dd92052136aae174bab9ff4915d964dc5ea826bec8423d13
 size 1115316658

 version https://git-lfs.github.com/spec/v1
+oid sha256:e316b4306d445aae1da6157e80f941444e965b505c69661a1294aab3a5af6d4a
 size 1115316658

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:52453acfb8e342e114302b737611fd2cec5ad6385d83476c890b56eb1f71e974
 size 4536

 version https://git-lfs.github.com/spec/v1
+oid sha256:7941163c8fd184d99820f8db753da01c36d7b29d24920d212b3f00f614ef85cd
 size 4536
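
The two `.bin` entries are Git LFS pointer files: the diff shows only a new `oid` while `size` stays the same. As a rough sketch, a downloaded file could be checked against such a pointer as follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check reads the whole file into memory, which would need to be replaced by chunked hashing for the roughly 1.1 GB `pytorch_model.bin`.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that the bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer for training_args.bin as it appears on the added side of the diff.
pointer_text = """
version https://git-lfs.github.com/spec/v1
oid sha256:7941163c8fd184d99820f8db753da01c36d7b29d24920d212b3f00f614ef85cd
size 4536
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(open("training_args.bin", "rb").read(), pointer)` would then confirm that a local copy corresponds to this commit, assuming the file has been fetched through LFS.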