qsnell committed
Commit 0ac201e
1 Parent(s): c31b11c

End of training

Files changed (2):
  1. README.md +9 -9
  2. generation_config.json +5 -2
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
-license: apache-2.0
-base_model: distilgpt2
+license: llama3.2
+base_model: meta-llama/Llama-3.2-1B
 tags:
 - generated_from_trainer
 datasets:
@@ -16,9 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # my_awesome_eli5_clm-model
 
-This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the eli5_category dataset.
+This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on the eli5_category dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8283
+- Loss: 3.7905
 
 ## Model description
 
@@ -38,8 +38,8 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 24
-- eval_batch_size: 24
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -49,9 +49,9 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log | 1.0 | 438 | 3.8407 |
-| 3.9608 | 2.0 | 876 | 3.8299 |
-| 3.8812 | 3.0 | 1314 | 3.8283 |
+| 3.1211 | 1.0 | 663 | 3.1149 |
+| 2.4299 | 2.0 | 1326 | 3.2926 |
+| 1.7842 | 3.0 | 1989 | 3.7905 |
 
 
 ### Framework versions
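
For reference, the updated hyperparameters above map onto `transformers` `TrainingArguments` as follows. This is a minimal sketch, not the training script from this repo: the output directory and the per-epoch evaluation strategy are assumptions inferred from the card, and the epoch count is taken from the results table.

```python
from transformers import TrainingArguments

# Sketch of the card's training setup; values not shown in the diff are assumptions.
training_args = TrainingArguments(
    output_dir="my_awesome_eli5_clm-model",  # assumed; matches the card title
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=3,     # results table runs to epoch 3.0
    eval_strategy="epoch",  # assumed, to match the per-epoch validation rows
)
```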
generation_config.json CHANGED
@@ -1,6 +1,9 @@
 {
   "_from_model_config": true,
-  "bos_token_id": 50256,
-  "eos_token_id": 50256,
+  "bos_token_id": 128000,
+  "do_sample": true,
+  "eos_token_id": 128001,
+  "temperature": 0.6,
+  "top_p": 0.9,
   "transformers_version": "4.45.2"
 }
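
These generation defaults ship with the checkpoint and are picked up automatically by `generate()`. A minimal usage sketch, assuming the model is published under a repo id like the placeholder below (not confirmed by this commit):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "qsnell/my_awesome_eli5_clm-model"  # placeholder repo id; substitute the real one
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# generation_config.json is loaded with the checkpoint, so generate() now samples with
# temperature=0.6 and top_p=0.9 and stops at the Llama 3.2 eos_token_id (128001).
inputs = tokenizer("Explain like I'm five: why is the sky blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```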