---
datasets:
- ola13/small-the_pile
language:
- en
metrics:
- accuracy
- code_eval
pipeline_tag: text-generation
tags:
- causal_lm
---
# Model Card for GPT-6B_Tuned_small_pile

<!-- Provide a quick summary of what the model is/does. -->

GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million examples from the Pile dataset.

Architecture: `n_embd: 4096`, `n_layer: 28`, `n_positions: 2048`
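
As a minimal loading sketch (assuming the checkpoint is published on the Hugging Face Hub; `your-org/GPT-6B_Tuned_small_pile` below is a placeholder repo ID, not the actual path), the model can be used with the standard `transformers` API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID -- substitute the actual Hub path of this checkpoint.
model_id = "your-org/GPT-6B_Tuned_small_pile"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bf16 matches the mixed-precision setting used during tuning (listed below).
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Pile is a dataset", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```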

Tuning parameters (a training-loop sketch using these values follows the list):

    val_split_percent: 20
    momentum: 0.9
    train_batch_size (effective): 32
    train_micro_batch: 16
    gradient_accumulation_steps: 2
    gradient_clipping: 0.5
    learning_rate: 0.00001
    weight_decay: 0.01
    lr_scheduler: cosine
    lr_warmup_steps: 1000
    lr_decay: 0.1
    lr_decay_step: 2000
    mixed_precision: bf16
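
As an illustration only (the actual training script is not included in this card): the optimizer choice is inferred from the `momentum` setting, `num_training_steps` and the dataloader are user-supplied placeholders, and the `lr_decay`/`lr_decay_step` values are not reflected by the standard cosine-schedule helper used here. With those caveats, the values above map onto a PyTorch loop roughly like this:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

def tune(model, train_dataloader, num_training_steps):
    """Sketch of a loop matching the listed hyperparameters (not the actual script)."""
    # momentum: 0.9 suggests SGD; the card does not name the optimizer explicitly.
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-5,
                                momentum=0.9, weight_decay=0.01)
    # Cosine schedule with 1000 warmup steps, as listed above.
    scheduler = get_cosine_schedule_with_warmup(
        optimizer, num_warmup_steps=1000, num_training_steps=num_training_steps)

    accum_steps = 2  # micro-batch 16 x 2 accumulation = effective batch 32
    for step, batch in enumerate(train_dataloader):
        # bf16 mixed precision; batches are assumed to carry `labels`.
        with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
            loss = model(**batch).loss / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            # gradient_clipping: 0.5
            torch.nn.utils.clip_grad_norm_(model.parameters(), 0.5)
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
```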



![image.png](https://s3.amazonaws.com/moonup/production/uploads/642bb1915df44ff245471fca/Ke-ShGT0sBVGEjrShxped.png)

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Causal language model (decoder-only transformer, GPT-J architecture)
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** [EleutherAI/gpt-j-6b](https://huggingface.co/EleutherAI/gpt-j-6b)