Model Card for GPT-6B_Tuned_small_pile

GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million examples from the Pile dataset.

n_embd: 4096, n_layer: 28, n_positions: 2048
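As a rough sanity check on the "6B" in the name, the architecture values above can be turned into an approximate parameter count. The sketch below uses the common rule of thumb of about 12 × n_embd² weights per transformer block; it assumes an MLP hidden size of 4 × n_embd (not stated in this card) and ignores embeddings, biases, and layer norms, so it is an estimate rather than an exact figure.

```python
# Architecture values copied from the card.
N_EMBD = 4096
N_LAYER = 28


def approx_block_params(n_embd: int, n_layer: int) -> int:
    """Estimate the weights in the transformer blocks only.

    Assumption (not from the card): the MLP hidden size is 4 * n_embd.
    Embeddings, biases, and layer norms are ignored.
    """
    attention = 4 * n_embd * n_embd  # query, key, value, output projections
    mlp = 8 * n_embd * n_embd        # up-projection and down-projection
    return n_layer * (attention + mlp)


total = approx_block_params(N_EMBD, N_LAYER)
print(f"{total / 1e9:.2f}B parameters in transformer blocks")  # prints 5.64B ...
```

The remaining parameters come from the token embedding and output layers, which depend on the vocabulary size and are not listed here.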

Tuning Parameters:

val_split_percent: 20

momentum: 0.9 

train_batch_size (effective): 32

train_micro_batch: 16

gradient_accumulation_steps: 2

gradient_clipping: 0.5

learning_rate: 0.00001

weight_decay: 0.01

lr_scheduler: cosine

lr_warmup_steps: 1000

lr_decay: 0.1

lr_decay_step: 2000

mixed_precision: bf16
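The batch-size and learning-rate settings above can be illustrated with a short sketch. The batch arithmetic follows directly from the listed values (16 × 2 = 32, which implies a single data-parallel worker). The schedule is more speculative: the card does not say how lr_decay and lr_decay_step interact with the cosine scheduler, so this sketch assumes, purely for illustration, a linear warmup followed by a cosine phase that lasts lr_decay_step steps and ends at lr_decay × learning_rate. The function names are hypothetical and are not taken from the training code.

```python
import math

# Values copied from the card.
LEARNING_RATE = 1e-5
WARMUP_STEPS = 1000
MICRO_BATCH = 16
GRAD_ACCUM_STEPS = 2

# Assumed interpretation of lr_decay_step and lr_decay (not confirmed by the card).
DECAY_STEPS = 2000
MIN_LR_RATIO = 0.1


def effective_batch_size(micro_batch: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return micro_batch * accum_steps * num_devices


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then cosine decay to a floor."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    # Fraction of the cosine phase completed, clamped at 1.0 once it ends.
    progress = min((step - WARMUP_STEPS) / DECAY_STEPS, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return LEARNING_RATE * (MIN_LR_RATIO + (1.0 - MIN_LR_RATIO) * cosine)


print(effective_batch_size(MICRO_BATCH, GRAD_ACCUM_STEPS))
for step in (0, 500, 1000, 2000, 3000):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 1e-5 at step 1000 and settles at 1e-6 from step 3000 onward; a different reading of lr_decay would change the tail of the curve but not the warmup.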

## Model Details

Model Description

  • Developed by: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: [More Information Needed]
  • Language(s) (NLP): [More Information Needed]
  • License: [More Information Needed]
  • Finetuned from model: EleutherAI/gpt-j-6b

Dataset used to train Defalt-404/GPT-6B_Tuned_small_pile