Update README.md
<i>or what ChatGPT suggests, <b>"Crafting a Rapid prototype of an Intelligent llm App using open source resources"</b>.</i>
</p>

The initial objective of the CRIA project is to develop a comprehensive end-to-end chatbot system, starting from instruction-tuning a large language model and extending to deploying it on the web with frameworks such as Next.js.

Specifically, we have fine-tuned the `llama-2-7b-chat-hf` model with QLoRA (4-bit precision) using the [mlabonne/CodeLlama-2-20k](https://huggingface.co/datasets/mlabonne/CodeLlama-2-20k) dataset. This fine-tuned model serves as the backbone for the [CRIA chat](https://chat.walterteng.com) platform.
## 📦 Model Release

CRIA v1.3 comes with several variants.

It was trained in a Google Colab notebook with a T4 GPU and high RAM.

### Training procedure

The following `bitsandbytes` quantization config was used during training:

- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float16
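
For reference, the same configuration can be expressed with `transformers`' `BitsAndBytesConfig`; a sketch for readers who want to reproduce the setup, with default-valued fields written out only for completeness:

```python
import torch
from transformers import BitsAndBytesConfig

# Programmatic equivalent of the config listed above.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=False,
    load_in_4bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)
```

Only `load_in_4bit`, `bnb_4bit_quant_type`, and `bnb_4bit_compute_dtype` deviate from the library defaults here; the `llm_int8_*` fields govern 8-bit loading and are inert when loading in 4-bit.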

### Framework versions

- PEFT 0.4.0

## 💻 Usage
```python
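# Minimal inference sketch. The repo id below is a hypothetical placeholder;
# substitute the released CRIA checkpoint you want to run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "davzoku/cria-llama2-7b-v1.3"  # assumed name, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Llama-2-chat models expect the [INST] ... [/INST] prompt format.
prompt = "[INST] What is a cria? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))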
```

We'd like to thank:
- [mlabonne](https://huggingface.co/mlabonne) for his article and resources on the implementation of instruction tuning.
- [TheBloke](https://huggingface.co/TheBloke) for his script for LLM quantization.