Locutusque committed
Commit • fc19730 • 1 Parent(s): d4bd4d7
Update README.md

README.md CHANGED
@@ -36,4 +36,46 @@ The model is evaluated based on several metrics, including loss, reward, penalty

- Average loss: 1.7

## Limitations and Bias

Because I have a rather weak computer for machine learning, I was not able to train this model for very long. The model may output irrelevant answers, and its responses can sometimes be nonsensical. The Inference API is not a recommended place to test the model, because the model requires a specific input format (shown in the snippet below). This model was not fine-tuned to remember chat history, so it cannot be asked follow-up questions (if anyone wants to fine-tune it so that it does remember the chat history, be my guest). A GPU with at least 4 gigabytes of VRAM is recommended for acceptable response-generation speed. I also don't recommend loading the model automatically from the transformers library, because that does not work as intended. Instead, download the model and tokenizer manually and use the model like this:

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Start from the base GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Register the special tokens the checkpoint was fine-tuned with
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
tokenizer.add_special_tokens({'eos_token': '<|End|>'})
special_tokens = {
    "additional_special_tokens": ["<|USER|>", "<|SYSTEM|>", "<|ASSISTANT|>"]
}
tokenizer.add_special_tokens(special_tokens)
model.resize_token_embeddings(len(tokenizer))

# Load the fine-tuned weights (point this at your local copy of pytorch_model.bin)
model.load_state_dict(torch.load("D:\\Projects\\results\\pytorch_model.bin"))
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_text(model, tokenizer, prompt, max_length=1024):
    # Wrap the user prompt in the input format the model was trained on
    prompt = f'<|SYSTEM|> You are a helpful AI designed to answer questions <|USER|> {prompt} <|ASSISTANT|> '
    input_ids = tokenizer.encode(prompt, add_special_tokens=True, return_tensors="pt").to(device)
    attention_mask = torch.ones_like(input_ids).to(device)
    output = model.generate(input_ids,
                            max_length=max_length,
                            do_sample=True,
                            top_k=50,
                            top_p=0.30,
                            pad_token_id=tokenizer.pad_token_id,
                            eos_token_id=tokenizer.eos_token_id,
                            attention_mask=attention_mask)
    # Keep special tokens in the decoded text so the assistant's reply can be located
    output_text = tokenizer.decode(output[0], skip_special_tokens=False)
    # Return only the text between <|ASSISTANT|> and the next special token
    assistant_token_index = output_text.index('<|ASSISTANT|>') + len('<|ASSISTANT|>')
    next_token_index = output_text.find('<|', assistant_token_index)
    if next_token_index == -1:  # generation hit max_length before emitting another special token
        next_token_index = len(output_text)
    return output_text[assistant_token_index:next_token_index]

# Loop to interact with the model
while True:
    prompt = input("Enter a prompt (or 'q' to quit): ")
    if prompt == "q":
        break
    output_text = generate_text(model, tokenizer, prompt)
    print(output_text)
```
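
The snippet above hard-codes a local Windows path for `pytorch_model.bin`. As a minimal sketch of the "download the model manually" step, assuming the checkpoint is hosted on the Hugging Face Hub, `huggingface_hub.hf_hub_download` can fetch the weights file and return a local path to pass to `torch.load`; the repo id below is a placeholder, not this model's confirmed location:

```python
# Minimal sketch of the manual download step, assuming the checkpoint is
# hosted on the Hugging Face Hub. The repo_id is a placeholder; substitute
# this model's actual repository.
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(
    repo_id="your-username/your-model",  # placeholder, not the real repo id
    filename="pytorch_model.bin",
)

# weights_path can then replace the hard-coded path in the snippet above:
# model.load_state_dict(torch.load(weights_path))
```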