Final model pretrained on the greek-longformer after 3 epochs

Files changed (8)

README.md CHANGED

@@ -1,25 +1,23 @@
 ---
 language:
-- el
 tags:
-- text
-- language-modeling
 metrics:
-- accuracy
 model-index:
-- name: longformer
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# longformer
-This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.9004
-- Accuracy: 0.3706
 ## Model description
@@ -38,23 +36,23 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3.0
 ### Training results
 ### Framework versions
-- Transformers 4.27.0.dev0
-- Pytorch 1.13.1+cu117
-- Datasets 2.9.0
 - Tokenizers 0.13.2

 ---
 language:
+  - el
 tags:
+  - text
+  - language-modeling
 metrics:
+  - accuracy
 model-index:
+  - name: greek-media-longformer-base-4096-uncased
+    results: []
 ---
+# Greek Media Longformer
+This model is a second-stage pretrained version of [dimitriz/greek-longformer-base-4096](https://huggingface.co/dimitriz/greek-longformer-base-4096) trained on the [dimitriz/greek_media_texts](https://huggingface.co/datasets/dimitriz/greek_media_texts) dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1910
+- Accuracy: 0.7482
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 3.0
 ### Training results
 ### Framework versions
+- Transformers 4.28.0.dev0
+- Pytorch 2.0.0+cu118
+- Datasets 2.11.0
 - Tokenizers 0.13.2
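
The figures in the two versions of the model card can be sanity-checked with a little arithmetic. The sketch below assumes a single training device (an assumption, though it matches the reported totals) and assumes the reported loss is a mean cross-entropy in nats, so that its exponential can be read as a perplexity-style number. The helper names are hypothetical and do not come from the training script.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def perplexity(mean_cross_entropy: float) -> float:
    """exp(loss), assuming the loss is a mean cross-entropy measured in nats."""
    return math.exp(mean_cross_entropy)


# Values copied from the old and new README hyperparameter lists.
old_total = effective_batch_size(per_device=8, accumulation_steps=4)
new_total = effective_batch_size(per_device=8, accumulation_steps=8)

# Values copied from the old and new evaluation results.
old_ppl = perplexity(3.9004)
new_ppl = perplexity(1.1910)

print(f"total_train_batch_size: {old_total} -> {new_total}")
print(f"exp(eval loss): {old_ppl:.2f} -> {new_ppl:.2f}")
```

Under these assumptions the totals reproduce the `total_train_batch_size` values of 32 and 64 listed in the card.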

config.json CHANGED

@@ -38,5 +38,5 @@
   "torch_dtype": "float32",
   "transformers_version": "4.26.1",
   "type_vocab_size": 2,
-  "vocab_size": 50265
 }

   "torch_dtype": "float32",
   "transformers_version": "4.26.1",
   "type_vocab_size": 2,
+  "vocab_size": 52000
 }

merges.txt CHANGED

The diff for this file is too large to render. See raw diff

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c341a01633820524a46ecf28b8a0efa2f924f1300f919d3981ba5becd47402f9
-size 594943893

 version https://git-lfs.github.com/spec/v1
+oid sha256:6bb69f67126e7771b15767ea1eac076420a61bd8a2e33fdf2eb8ef380f45403b
+size 600280789

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a581b6fd2919d140b593ee81fee7078311b9e00f6effd5cec2bf0c51f2963a9e
-size 594983160

 version https://git-lfs.github.com/spec/v1
+oid sha256:ae841c3dfdaef7605e51056105c3d361313b0140ec7d15788da2133c3b45f9d7
+size 600313080
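
The growth of both weight files can be cross-checked against the `vocab_size` change in `config.json`. The sketch below assumes a hidden size of 768, which is typical for a "base" model but is not shown in this diff, and uses the `float32` dtype that `config.json` does report. It is a rough consistency check, not a description of how the checkpoints were produced.

```python
# Values copied from the diff above.
OLD_VOCAB, NEW_VOCAB = 50265, 52000
PT_OLD_SIZE, PT_NEW_SIZE = 594943893, 600280789   # pytorch_model.bin
TF_OLD_SIZE, TF_NEW_SIZE = 594983160, 600313080   # tf_model.h5

HIDDEN_SIZE = 768        # ASSUMPTION: not visible in this diff
BYTES_PER_FLOAT32 = 4    # from "torch_dtype": "float32"

added_rows = NEW_VOCAB - OLD_VOCAB

# Extra bytes needed for the new rows of the token embedding matrix.
embedding_growth = added_rows * HIDDEN_SIZE * BYTES_PER_FLOAT32

pt_delta = PT_NEW_SIZE - PT_OLD_SIZE
tf_delta = TF_NEW_SIZE - TF_OLD_SIZE

print(f"added vocabulary rows:      {added_rows}")
print(f"expected embedding growth:  {embedding_growth} bytes")
print(f"pytorch_model.bin growth:   {pt_delta} bytes")
print(f"tf_model.h5 growth:         {tf_delta} bytes")
print(f"unexplained PyTorch bytes:  {pt_delta - embedding_growth}")
```

With a hidden size of 768, the `tf_model.h5` growth equals the expected embedding growth exactly. The PyTorch file grows by a few thousand bytes more; one possible explanation is an extra per-token value such as an output bias plus serialization overhead, but that is a guess.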

tokenizer.json CHANGED

The diff for this file is too large to render. See raw diff

tokenizer_config.json CHANGED

@@ -13,7 +13,6 @@
     "single_word": false
   },
   "model_max_length": 4096,
-  "name_or_path": "greek-media-longformer-4096",
   "pad_token": "<pad>",
   "sep_token": "</s>",
   "special_tokens_map_file": null,

     "single_word": false
   },
   "model_max_length": 4096,
   "pad_token": "<pad>",
   "sep_token": "</s>",
   "special_tokens_map_file": null,

vocab.json CHANGED

The diff for this file is too large to render. See raw diff