eunyounglee committed on
Commit a788b47
1 Parent(s): 2f9a1a9

Update README.md

Files changed (1)
  1. README.md +34 -12
README.md CHANGED
@@ -5,24 +5,46 @@ pipeline_tag: text-generation

  Trained: Fine-tuning
  Config file: 2.7B
- Data: Vietnamese Alpaca(16412 rows) + Vietnamese QA Dataset based on viwik18(14293 rows)
  ---
  # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->
-
- Pretrained GPT-NeoX model with 450GB+ Vietnamese dataset. Fine-tuned with 12MB Question & Answer dataset. Took 18 hours to reach 10 epochs. Trained on A100 40GB GPU and 48 core CPU.
+ This model is pretrained and fine-tuned on Vietnamese data, based on GPT-NeoX, a large language model developed by EleutherAI.
+ The GPT-NeoX model was pretrained on a 450GB+ Vietnamese dataset and fine-tuned on a 12MB Vietnamese Question & Answer dataset. Training ran on an A100 40GB GPU and a 48-core CPU and took 18 hours to reach 10 epochs.

  ## Model Details
- Config file: 2.7B
- Data: Vietnamese Alpaca(16412 rows) + Vietnamese QA Dataset based on viwik18(14293 rows)
-
- ### Model Description

- <!-- Provide a longer summary of what this model is. -->
+ ### Training Data
+ - **Pre-train:**
+ Vietnamese CulturaX Dataset (450GB) + Project (1.3GB) + Crawled Vietnamese Wikipedia (630MB) + viwik18 (1.27GB)
+ - **Fine-tuning:**
+ 12MB Vietnamese Question & Answer dataset:
+ Vietnamese Alpaca (16412 rows) + Vietnamese QA Dataset based on viwik18 (14293 rows)

-
-
- - **Developed by:** Eunyoung Lee
+ ### Training Hardware
+ - **Developed by:** Deeploading
  - **Model type:** GPT-NeoX
  - **Language(s) (NLP):** Vietnamese
+
+ <figure style="width:30em">
+
+ | Hyperparameter         | Value      |
+ | ---------------------- | ---------- |
+ | n<sub>parameters</sub> | 2670182400 |
+ | n<sub>layers</sub>     | 32         |
+ | d<sub>model</sub>      | 2560       |
+ | n<sub>heads</sub>      | 32         |
+ | d<sub>head</sub>       | 128        |
+ | n<sub>vocab</sub>      | 60000      |
+ | Sequence Length        | 2048       |
+ | Learning Rate          | 0.00016    |
+ | Positional Encoding    | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+ </figure>
+
+ ### How to use
+ The model can be loaded using the `AutoModelForCausalLM` functionality:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
+ model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
+ ```
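
The "How to use" snippet added in this commit only loads the tokenizer and model. As a follow-up, here is a minimal generation sketch using the standard `generate` API from `transformers`; the prompt and sampling parameters are illustrative choices, not settings recommended in the model card.

```python
# Minimal generation sketch for the checkpoint loaded in the snippet above.
# Prompt and sampling parameters are illustrative, not recommendations from the card.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Việt Nam là"  # Vietnamese prompt: "Vietnam is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```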
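
Relating back to the hyperparameter table added in this commit: the architectural values roughly correspond to fields of `transformers.GPTNeoXConfig`. The sketch below only illustrates that mapping under the assumption that the checkpoint uses the standard GPT-NeoX architecture; the `config.json` shipped with the repository remains authoritative, and the learning rate is a training setting with no config field.

```python
# Illustrative mapping of the table's architectural hyperparameters onto GPTNeoXConfig.
# The repository's own config.json is authoritative; compare it via AutoConfig.
from transformers import AutoConfig, GPTNeoXConfig

sketch = GPTNeoXConfig(
    vocab_size=60000,              # n_vocab
    hidden_size=2560,              # d_model
    num_hidden_layers=32,          # n_layers
    num_attention_heads=32,        # n_heads (per-head size is derived as hidden_size // num_attention_heads)
    max_position_embeddings=2048,  # sequence length
)

published = AutoConfig.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
print(sketch)
print(published)
```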