---
datasets:
- ola13/small-the_pile
language:
- en
metrics:
- accuracy
- code_eval
pipeline_tag: text-generation
tags:
- causal_lm
---

# Model Card for GPT-6B_Tuned_small_pile

GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million examples from the Pile dataset.

Architecture: n_embd: 4096, n_layer: 28, n_positions: 2048

Tuning parameters (see the configuration sketch at the end of this card):

- val_split_percent: 20
- momentum: 0.9
- train_batch_size (effective): 32
- train_micro_batch: 16
- gradient_accumulation_steps: 2
- gradient_clipping: 0.5
- learning_rate: 0.00001
- weight_decay: 0.01
- lr_scheduler: cosine
- lr_warmup_steps: 1000
- lr_decay: 0.1
- lr_decay_step: 2000
- mixed_precision: bf16

![image.png](https://s3.amazonaws.com/moonup/production/uploads/642bb1915df44ff245471fca/Ke-ShGT0sBVGEjrShxped.png)

## Model Details

### Model Description

- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Causal language model (GPT-J architecture)
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** EleutherAI/gpt-j-6b
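The tuning parameters listed above map fairly directly onto a standard Hugging Face `Trainer` configuration. Below is a minimal sketch of that mapping, assuming the `transformers` `TrainingArguments` API; the output directory is a placeholder, and the values of momentum, lr_decay, and lr_decay_step are omitted because they do not correspond to stock `TrainingArguments` fields (momentum is an optimizer setting, and the staged decay is not part of the built-in cosine schedule). This illustrates the stated hyperparameters rather than reproducing the original training script.

```python
from transformers import TrainingArguments

# Minimal sketch mapping the tuning parameters above onto
# Hugging Face TrainingArguments. output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gpt-6b-tuned-small-pile",  # placeholder path
    per_device_train_batch_size=16,        # train_micro_batch: 16
    gradient_accumulation_steps=2,         # effective batch size: 16 * 2 = 32
    learning_rate=1e-5,                    # learning_rate: 0.00001
    weight_decay=0.01,
    max_grad_norm=0.5,                     # gradient_clipping: 0.5
    lr_scheduler_type="cosine",            # lr_scheduler: cosine
    warmup_steps=1000,                     # lr_warmup_steps: 1000
    bf16=True,                             # mixed_precision: bf16
)
```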
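Since this is a standard GPT-J causal language model, it should be loadable through the `transformers` auto classes. A minimal sketch, assuming the checkpoint is published on the Hub; the repo id below is a placeholder and must be replaced with the actual path of this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual Hub path of this checkpoint.
model_id = "your-username/GPT-6B_Tuned_small_pile"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16, matching the mixed-precision setting used in training.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Pile is a large, diverse dataset", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-J has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```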