ankitkumar-patel-179 committed on
Commit 6ac2223
1 Parent(s): e78999c

Model save

Files changed (1): README.md (+6, -6)
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
-license: other
-base_model: huggyllama/llama-7b
+license: apache-2.0
+base_model: kaist-ai/CoT-T5-11B
 tags:
 - generated_from_trainer
 model-index:
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # llama_instr_tune_fact_checking
 
-This model is a fine-tuned version of [huggyllama/llama-7b](https://huggingface.co/huggyllama/llama-7b) on the None dataset.
+This model is a fine-tuned version of [kaist-ai/CoT-T5-11B](https://huggingface.co/kaist-ai/CoT-T5-11B) on the None dataset.
 
 ## Model description
 
@@ -33,11 +33,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 16
+- train_batch_size: 32
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 64
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
@@ -51,5 +51,5 @@ The following hyperparameters were used during training:
 
 - Transformers 4.34.1
 - Pytorch 2.1.0+cu118
-- Datasets 2.14.5
+- Datasets 2.14.6
 - Tokenizers 0.14.1
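
For reference, the hyperparameters listed in the diff map directly onto `transformers.TrainingArguments` fields. The sketch below assumes single-GPU training, so the new total train batch size of 128 is 32 per device × 4 gradient-accumulation steps × 1 device; the `output_dir` value and any fields not shown in the card (e.g. number of epochs) are illustrative only, not taken from the commit.

```python
# Minimal sketch of the updated training configuration, assuming a single GPU
# and Transformers 4.34.1; output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama_instr_tune_fact_checking",  # hypothetical path
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # effective batch: 32 * 4 * num_devices = 128 on 1 GPU
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    adam_beta1=0.9,    # Adam betas and epsilon as listed in the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```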