eunyounglee committed on
Commit a788b47
1 Parent(s): 2f9a1a9

Update README.md

Files changed (1)
  1. README.md +34 -12
README.md CHANGED
@@ -5,24 +5,46 @@ pipeline_tag: text-generation

  Trained: Fine-tuning
  Config file: 2.7B
- Data: Vietnamese Alpaca(16412 rows) + Vietnamese QA Dataset based on viwik18(14293 rows)
  ---
  # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->
-
- Pretrained GPT-NeoX model with 450GB+ Vietnamese dataset. Fine-tuned with 12MB Question & Answer dataset. Took 18 hours to reach 10 epochs. Trained on A100 40GB GPU and 48 core CPU.
+ This model is pretrained and fine-tuned on Vietnamese data, based on GPT-NeoX, a large language model developed by EleutherAI.
+ The GPT-NeoX model was pretrained on a 450GB+ Vietnamese dataset and fine-tuned on a 12MB Vietnamese Question & Answer dataset. Training ran on an A100 40GB GPU and a 48-core CPU and took 18 hours to reach 10 epochs.

  ## Model Details
- Config file: 2.7B
- Data: Vietnamese Alpaca(16412 rows) + Vietnamese QA Dataset based on viwik18(14293 rows)
-
- ### Model Description

- <!-- Provide a longer summary of what this model is. -->
+ ### Training Data
+ - **Pre-train:**
+ Vietnamese CulturaX Dataset (450GB) + Project (1.3GB) + Crawled Vietnamese Wikipedia (630MB) + viwik18 (1.27GB)
+ - **Fine-tuning:**
+ 12MB Vietnamese Question & Answer dataset:
+ Vietnamese Alpaca (16412 rows) + Vietnamese QA Dataset based on viwik18 (14293 rows)

-
-
- - **Developed by:** Eunyoung Lee
+ ### Training Hardware
+ - **Developed by:** Deeploading
  - **Model type:** GPT-NeoX
  - **Language(s) (NLP):** Vietnamese
+
+ <figure style="width:30em">
+
+ | Hyperparameter         | Value      |
+ | ---------------------- | ---------- |
+ | n<sub>parameters</sub> | 2670182400 |
+ | n<sub>layers</sub>     | 32         |
+ | d<sub>model</sub>      | 2560       |
+ | n<sub>heads</sub>      | 32         |
+ | d<sub>head</sub>       | 128        |
+ | n<sub>vocab</sub>      | 60000      |
+ | Sequence Length        | 2048       |
+ | Learning Rate          | 0.00016    |
+ | Positional Encoding    | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+ </figure>
+
+ ### How to use
+ The model can be loaded using the `AutoModelForCausalLM` functionality:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
+ model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
+ ```
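
The "How to use" snippet added in this commit only loads the tokenizer and model. As a follow-up, here is a minimal generation sketch using the standard `generate` API from `transformers`; the prompt and sampling parameters are illustrative choices, not settings recommended in the model card.

```python
# Minimal generation sketch for the checkpoint loaded in the snippet above.
# Prompt and sampling parameters are illustrative, not recommendations from the card.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Việt Nam là"  # Vietnamese prompt: "Vietnam is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```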
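
Relating back to the hyperparameter table added in this commit: the architectural values roughly correspond to fields of `transformers.GPTNeoXConfig`. The sketch below only illustrates that mapping under the assumption that the checkpoint uses the standard GPT-NeoX architecture; the `config.json` shipped with the repository remains authoritative, and the learning rate is a training setting with no config field.

```python
# Illustrative mapping of the table's architectural hyperparameters onto GPTNeoXConfig.
# The repository's own config.json is authoritative; compare it via AutoConfig.
from transformers import AutoConfig, GPTNeoXConfig

sketch = GPTNeoXConfig(
    vocab_size=60000,              # n_vocab
    hidden_size=2560,              # d_model
    num_hidden_layers=32,          # n_layers
    num_attention_heads=32,        # n_heads (per-head size is derived as hidden_size // num_attention_heads)
    max_position_embeddings=2048,  # sequence length
)

published = AutoConfig.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
print(sketch)
print(published)
```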