lucifertrj committed
Commit: 151c18f
Parent(s): 6442a22
Update README.md
README.md
CHANGED
@@ -9,8 +9,6 @@ pipeline_tag: text-generation

## Model Description

- <!-- Provide a quick summary of what the model is/does. -->
-

Buddhi is a general-purpose chat model, fine-tuned on Mistral 7B Instruct and optimised to handle an extended context length of up to 128,000 tokens using the YaRN [(Yet another RoPE extensioN)](https://arxiv.org/abs/2309.00071) technique. This enhancement allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarization, detailed narrative generation, and intricate question-answering.

## Dataset Creation
@@ -36,13 +34,18 @@ Please check out [Flash Attention 2](https://github.com/Dao-AILab/flash-attentio

**Implementation**:

+ > Note: Running the model at its full context length requires roughly 70 GB of VRAM. For experimentation, we limit the context length to 75K instead of 128K, which makes the model testable on 30-35 GB of VRAM.
+

```python
from vllm import LLM, SamplingParams

llm = LLM(
-
-
-
+    model='aiplanet/buddhi-128k-chat-7b',
+    trust_remote_code=True,
+    download_dir='aiplanet/buddhi-128k-chat-7b',
+    dtype='bfloat16',
+    gpu_memory_utilization=1,
+    max_model_len=75000
)

prompts = [
@@ -63,8 +66,12 @@ for output in outputs:
    generated_text = output.outputs[0].text
    print(generated_text)
    print("\n\n")
+
+ # We have also attached a Colab notebook with two more experiments: a long essay and an entire book.
```

+ For the output, check out the Colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
+

### Transformers - Basic Implementation

```python
@@ -155,7 +162,7 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b

```
@misc {Chaitanya890, lucifertrj ,
-  author = {
+  author = { Chaitanya Singhal, Tarun Jain },
  title = { Buddhi-128k-Chat by AI Planet},
  year = 2024,
  url = { https://huggingface.co/aiplanet//Buddhi-128K-Chat },
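The changed hunks above show the new vLLM setup but truncate the prompt and sampling code around it. The sketch below is one plausible way the pieces fit together: it reuses the `LLM` arguments from the diff and wraps the prompt in the `[INST] ... [/INST]` markers that Mistral 7B Instruct models expect (the last hunk header references the README's instruction-format section). The prompt text and sampling values are illustrative assumptions, not taken from the README.

```python
from vllm import LLM, SamplingParams

# Reduced context (75K instead of 128K) so the model fits in roughly 30-35 GB of VRAM,
# as described in the note added by this commit.
llm = LLM(
    model='aiplanet/buddhi-128k-chat-7b',
    trust_remote_code=True,
    dtype='bfloat16',
    gpu_memory_utilization=1,
    max_model_len=75000,
)

# Mistral-style instruction wrapping; the document text here is a placeholder.
prompts = [
    "[INST] Summarise the following report in five bullet points:\n<long document goes here> [/INST]"
]
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=512)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    generated_text = output.outputs[0].text
    print(generated_text)
```

Setting `gpu_memory_utilization=1` lets vLLM claim the entire GPU; lowering it (for example to 0.9) leaves headroom for other processes on the same device.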
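The `### Transformers - Basic Implementation` section named in the diff lies outside the changed hunks, so its code does not appear above. As a minimal sketch (an assumption for illustration, not the README's actual snippet), a standard Hugging Face Transformers load of this checkpoint would look roughly like the following; `torch_dtype` and `device_map` are common choices rather than documented values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aiplanet/buddhi-128k-chat-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the bfloat16 used in the vLLM example
    device_map="auto",
    trust_remote_code=True,
)

# Same Mistral-style instruction wrapping as above; the prompt is a placeholder.
prompt = "[INST] Write a short poem about long context windows. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```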