Jungwonchang committed
Commit 680febb
Parent(s): c67c06c

Update README.md
README.md CHANGED
@@ -10,7 +10,7 @@ language:
 ---
 
 # Model Card for Model ID
-Korean Chatbot based on Alibaba's QWEN
+Korean Chatbot based on Alibaba's [QWEN](https://github.com/QwenLM/Qwen)
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6232fdee38869c4ca8fd49e2/CBQ0cdD54Sd7-rbNt-Mkb.png)
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1fmcq1YZaIYg-cuCS4aadomutLmzSyEYI#scrollTo=6c1edcdc-158d-4043-a7c7-1d145ebf2cd1)
 (keep in mind that basic colab runtime with T4 GPU will lead to OOM error. Fine-tuned version of Qwen-14b-Chat-Int4 will not have this issue)
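The Colab note in the hunk above points to the 4-bit (Int4) checkpoint as the way to avoid out-of-memory errors on a T4. As an illustration only, loading such a checkpoint typically looks like the sketch below; the repo id is the upstream Qwen checkpoint, not the fine-tuned model this card describes, and the prompt is arbitrary.

```python
# Sketch only: loading a 4-bit Qwen chat checkpoint, as the Colab note suggests.
# The repo id below is a placeholder, not the model released by this commit.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-14B-Chat-Int4"  # GPTQ 4-bit checkpoint; should fit on a 16 GB T4
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,  # Qwen ships custom modeling/chat code
).eval()

# Qwen's remote code exposes a chat() helper for single-turn queries.
response, history = model.chat(tokenizer, "안녕하세요, 자기소개 해주세요.", history=None)
print(response)
```

The `model.chat(...)` helper comes from Qwen's remote code; the README's own `qwen_chat_single_turn` wrapper, referenced in the next hunk, presumably builds on it.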
@@ -190,21 +190,10 @@ response = qwen_chat_single_turn(model, tokenizer, device, query=query,
 ### Training Procedure
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+The model was fine-tuned using LoRA (Low-Rank Adaptation), which allows for efficient training of large language models by updating only a small set of parameters.
+The fine-tuning process was conducted on a single node with 2 GPUs, utilizing distributed training to enhance the training efficiency and speed.
+The lora rank was set to 32, for I only had limited time to access the GPUs.
 
-#### Preprocessing [optional]
-
-[More Information Needed]
-
-
-#### Training Hyperparameters
-
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
-#### Speeds, Sizes, Times [optional]
-
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
-[More Information Needed]
 
 ## Evaluation
 
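The lines added in this hunk describe LoRA fine-tuning at rank 32 on a single node with 2 GPUs. The commit itself contains no training script, so the following is only a minimal sketch of such a setup using Hugging Face PEFT; the base checkpoint, target modules, and every hyperparameter other than the rank are assumptions.

```python
# Hypothetical LoRA setup matching the README note (rank 32);
# this is NOT the author's actual training script.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-14B-Chat-Int4",   # assumed base checkpoint
    device_map="auto",
    trust_remote_code=True,      # Qwen uses custom modeling code
)

lora_cfg = LoraConfig(
    r=32,                        # LoRA rank stated in the README
    lora_alpha=64,               # assumed, not given in the commit
    lora_dropout=0.05,           # assumed
    target_modules=["c_attn", "c_proj", "w1", "w2"],  # Qwen projection layers (assumed)
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```

On 2 GPUs such a run would typically be launched with torchrun, for example `torchrun --nproc_per_node 2 train_lora.py` (script name hypothetical).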
@@ -296,8 +285,8 @@ Jungwon Chang
 
 ## Model Card Contact
 
-
-
+
+
 
 ## Training procedure
 