qsnell committed
Commit 0ac201e
1 Parent(s): c31b11c

End of training

Files changed (2):
  1. README.md +9 -9
  2. generation_config.json +5 -2
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
-license: apache-2.0
-base_model: distilgpt2
+license: llama3.2
+base_model: meta-llama/Llama-3.2-1B
 tags:
 - generated_from_trainer
 datasets:
@@ -16,9 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # my_awesome_eli5_clm-model
 
-This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the eli5_category dataset.
+This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on the eli5_category dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8283
+- Loss: 3.7905
 
 ## Model description
 
@@ -38,8 +38,8 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 24
-- eval_batch_size: 24
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -49,9 +49,9 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log | 1.0 | 438 | 3.8407 |
-| 3.9608 | 2.0 | 876 | 3.8299 |
-| 3.8812 | 3.0 | 1314 | 3.8283 |
+| 3.1211 | 1.0 | 663 | 3.1149 |
+| 2.4299 | 2.0 | 1326 | 3.2926 |
+| 1.7842 | 3.0 | 1989 | 3.7905 |
 
 
 ### Framework versions
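
For reference, the updated hyperparameters above map onto `transformers` `TrainingArguments` as follows. This is a minimal sketch, not the training script from this repo: the output directory and the per-epoch evaluation strategy are assumptions inferred from the card, and the epoch count is taken from the results table.

```python
from transformers import TrainingArguments

# Sketch of the card's training setup; values not shown in the diff are assumptions.
training_args = TrainingArguments(
    output_dir="my_awesome_eli5_clm-model",  # assumed; matches the card title
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=3,     # results table runs to epoch 3.0
    eval_strategy="epoch",  # assumed, to match the per-epoch validation rows
)
```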
generation_config.json CHANGED
@@ -1,6 +1,9 @@
 {
   "_from_model_config": true,
-  "bos_token_id": 50256,
-  "eos_token_id": 50256,
+  "bos_token_id": 128000,
+  "do_sample": true,
+  "eos_token_id": 128001,
+  "temperature": 0.6,
+  "top_p": 0.9,
   "transformers_version": "4.45.2"
 }
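
These generation defaults ship with the checkpoint and are picked up automatically by `generate()`. A minimal usage sketch, assuming the model is published under a repo id like the placeholder below (not confirmed by this commit):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "qsnell/my_awesome_eli5_clm-model"  # placeholder repo id; substitute the real one
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# generation_config.json is loaded with the checkpoint, so generate() now samples with
# temperature=0.6 and top_p=0.9 and stops at the Llama 3.2 eos_token_id (128001).
inputs = tokenizer("Explain like I'm five: why is the sky blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```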