---
license: mit
datasets:
- Locutusque/InstructMix
language:
- en
metrics:
- bleu
- perplexity
- loss
- accuracy
pipeline_tag: text-generation
widget:
- text: >-
    <|USER|> Design a Neo4j database and Cypher function snippet to Display
    Extreme Dental hygiene: Using Mouthwash for Analysis for Beginners.
    Implement if/else or switch/case statements to handle different conditions
    related to the Consent. Provide detailed comments explaining your control
    flow and the reasoning behind each decision. <|ASSISTANT|>
- text: >-
    <|USER|> Write me a story about a magical place. <|ASSISTANT|>
- text: >-
    <|USER|> Write me an essay about the life of George Washington <|ASSISTANT|>
- text: >-
    <|USER|> Solve the following equation 2x + 10 = 20 <|ASSISTANT|>
- text: >-
    <|USER|> Craft me a list of some nice places to visit around the world. <|ASSISTANT|>
- text: >-
    <|USER|> How to manage a lazy employee: Address the employee verbally. Don't allow an employee's laziness or lack of enthusiasm to become a recurring issue. Tell the employee you're hoping to speak with them about workplace expectations and performance, and schedule a time to sit down together. Question: To manage a lazy employee, it is suggested to talk to the employee. True, False, or Neither? <|ASSISTANT|>
inference:
  parameters:
    temperature: 0.8
    do_sample: True
    top_p: 0.14
    top_k: 41
    max_new_tokens: 250
    repetition_penalty: 1.176
---
# Model Card
## Model Details
- Model Name: gpt2-xl-conversational
- Model Type: Language Modeling
- Task: Generating Conversational Responses
- Hardware: 1x Nvidia Titan V
- Description: This model is trained on a dataset of conversations between a user and an AI assistant, with the goal of generating a coherent and relevant response to the user's input. It uses the GPT-2 architecture, a transformer-based language model capable of generating text in a wide range of styles and tones. The model is fine-tuned on the conversational data using maximum likelihood estimation, and is evaluated on its ability to generate responses that are both grammatically correct and semantically relevant to the user's input.
## Intended Use
This model is intended to be used for generating conversational responses in a variety of contexts, such as chatbots, virtual assistants, and customer service applications. It is designed to provide natural and engaging responses to user input, with a focus on maintaining a consistent tone and style throughout the conversation. The model is suitable for use in both text-based and voice-based interfaces, and can be easily integrated into existing applications using the PyTorch and Transformers frameworks.
## Training Data
The model is trained on a large dataset of conversational data, consisting of interactions between users and an AI assistant. The data is preprocessed to remove sensitive information and formatted for language-model training. It is split into a training set, used to update the model parameters, and a validation set, used to evaluate model performance. The model was trained on 300,000 examples; the resulting metrics are reported under Evaluation Metrics.
## Model Architecture
The model uses the GPT-2 architecture, a transformer-based language model capable of generating text in a wide range of styles and tones. GPT-2 is a multi-layer, decoder-only transformer whose self-attention mechanisms allow the model to capture long-range dependencies and generate coherent text.
## Evaluation Metrics
The model is evaluated based on several metrics, including loss, reward, penalty, BLEU score, and perplexity. The loss metric is calculated during training and reflects the difference between the predicted output and the actual output. The reward metric is based on the number of correct words generated by the model, while the penalty metric penalizes the model for repeating words consecutively. The BLEU score measures the similarity between the generated text and the ground truth text, while the perplexity metric measures how well the model is able to predict the next word in a sequence. During training, the model achieved the following metrics:
- BLEU score: 52
- Accuracy: 53
- Perplexity: 4.3
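
The relationship between loss and perplexity, and the idea behind the repetition penalty, can be sketched in a few lines. This is a minimal illustration, not the training code used for this model: it assumes the loss is a mean per-token cross-entropy in nats, and `consecutive_repeats` is a hypothetical helper, since the exact penalty formula is not given here. Under that assumption, a perplexity of 4.3 corresponds to a mean loss of about 1.46.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity is the exponential of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


def consecutive_repeats(text: str) -> int:
    """Count words that immediately repeat the previous word (case-insensitive).

    Hypothetical stand-in for the repetition penalty described above.
    """
    words = text.lower().split()
    return sum(1 for prev, cur in zip(words, words[1:]) if prev == cur)


# A mean loss of ln(4.3) nats gives back the reported perplexity of 4.3.
print(perplexity_from_loss(math.log(4.3)))
print(consecutive_repeats("the the cat sat sat sat"))
```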
Evaluation metrics:
| Task |Version|Metric|Value| |Stderr|
|--------|------:|------|----:|---|-----:|
|pubmedqa| 0|acc |0.536|± |0.0223|
|arc_challenge| 0|acc_norm |0.2867|± |0.0132|
|arc_easy | 0|acc |0.5804|± |0.0101|
|arc_easy | 0|acc_norm|0.5707|±|0.0102|
|winogrande| 0|acc |0.5691|± |0.0139|
|truthfulqa_mc| 1|mc2 |0.3918|± |0.0144|
|anli_r1| 0|acc |0.338|± |0.0150|
|anli_r2| 0|acc |0.346|± |0.0151|
|anli_r3| 0|acc |0.355|± |0.0138|
|drop| 1|f1 |0.0034|± |0.0004|
|hendrycksTest-abstract_algebra | 1|acc | 0.32|± |0.0952|
|hendrycksTest-anatomy | 1|acc | 0.44|± |0.1013|
|hendrycksTest-astronomy | 1|acc | 0.24|± |0.0872|
|hendrycksTest-business_ethics | 1|acc | 0.24|± |0.0872|
|hendrycksTest-clinical_knowledge | 1|acc | 0.24|± |0.0872|
|hendrycksTest-college_biology | 1|acc | 0.20|± |0.0816|
|hendrycksTest-college_chemistry | 1|acc | 0.40|± |0.1000|
|hendrycksTest-college_computer_science | 1|acc | 0.36|± |0.0980|
|hendrycksTest-college_mathematics | 1|acc | 0.48|± |0.1020|
|hendrycksTest-college_medicine | 1|acc | 0.20|± |0.0816|
|hendrycksTest-college_physics | 1|acc | 0.44|± |0.1013|
|hendrycksTest-computer_security | 1|acc | 0.16|± |0.0748|
|hendrycksTest-conceptual_physics | 1|acc | 0.12|± |0.0663|
|hendrycksTest-econometrics | 1|acc | 0.16|± |0.0748|
|hendrycksTest-electrical_engineering | 1|acc | 0.28|± |0.0917|
|hendrycksTest-elementary_mathematics | 1|acc | 0.36|± |0.0980|
|hendrycksTest-formal_logic | 1|acc | 0.44|± |0.1013|
|hendrycksTest-global_facts | 1|acc | 0.20|± |0.0816|
|hendrycksTest-high_school_biology | 1|acc | 0.20|± |0.0816|
|hendrycksTest-high_school_chemistry | 1|acc | 0.28|± |0.0917|
|hendrycksTest-high_school_computer_science | 1|acc | 0.24|± |0.0872|
|hendrycksTest-high_school_european_history | 1|acc | 0.32|± |0.0952|
|hendrycksTest-high_school_geography | 1|acc | 0.32|± |0.0952|
|hendrycksTest-high_school_government_and_politics| 1|acc | 0.28|± |0.0917|
|hendrycksTest-high_school_macroeconomics | 1|acc | 0.28|± |0.0917|
|hendrycksTest-high_school_mathematics | 1|acc | 0.20|± |0.0816|
|hendrycksTest-high_school_microeconomics | 1|acc | 0.24|± |0.0872|
|hendrycksTest-high_school_physics | 1|acc | 0.28|± |0.0917|
|hendrycksTest-high_school_psychology | 1|acc | 0.32|± |0.0952|
|hendrycksTest-high_school_statistics | 1|acc | 0.40|± |0.1000|
|hendrycksTest-high_school_us_history | 1|acc | 0.32|± |0.0952|
|hendrycksTest-high_school_world_history | 1|acc | 0.36|± |0.0980|
|hendrycksTest-human_aging | 1|acc | 0.16|± |0.0748|
|hendrycksTest-human_sexuality | 1|acc | 0.40|± |0.1000|
|hendrycksTest-international_law | 1|acc | 0.24|± |0.0872|
|hendrycksTest-jurisprudence | 1|acc | 0.08|± |0.0554|
|hendrycksTest-logical_fallacies | 1|acc | 0.52|± |0.1020|
|hendrycksTest-machine_learning | 1|acc | 0.12|± |0.0663|
|hendrycksTest-management | 1|acc | 0.12|± |0.0663|
|hendrycksTest-marketing | 1|acc | 0.16|± |0.0748|
|hendrycksTest-medical_genetics | 1|acc | 0.12|± |0.0663|
|hendrycksTest-miscellaneous | 1|acc | 0.36|± |0.0980|
|hendrycksTest-moral_disputes | 1|acc | 0.08|± |0.0554|
|hendrycksTest-moral_scenarios | 1|acc | 0.44|± |0.1013|
|hendrycksTest-nutrition | 1|acc | 0.32|± |0.0952|
|hendrycksTest-philosophy | 1|acc | 0.44|± |0.1013|
|hendrycksTest-prehistory | 1|acc | 0.16|± |0.0748|
|hendrycksTest-professional_accounting | 1|acc | 0.28|± |0.0917|
|hendrycksTest-professional_law | 1|acc | 0.12|± |0.0663|
|hendrycksTest-professional_medicine | 1|acc | 0.40|± |0.1000|
|hendrycksTest-professional_psychology | 1|acc | 0.24|± |0.0872|
|hendrycksTest-public_relations | 1|acc | 0.08|± |0.0554|
|hendrycksTest-security_studies | 1|acc | 0.24|± |0.0872|
|hendrycksTest-sociology | 1|acc | 0.28|± |0.0917|
|hendrycksTest-us_foreign_policy | 1|acc | 0.24|± |0.0872|
|hendrycksTest-virology | 1|acc | 0.20|± |0.0816|
|hendrycksTest-world_religions | 1|acc | 0.16|± |0.0748|
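
The Stderr column in the table above can be related to the accuracy values through the standard error of a sample proportion. The sketch below assumes the sample (n - 1) form of the variance and assumes that each hendrycksTest subtask was scored on 25 examples. That sample size is an inference from the reported numbers, not something stated in this card. With those assumptions, an accuracy of 0.32 gives the 0.0952 shown in the table.

```python
import math


def accuracy_stderr(acc: float, n: int) -> float:
    """Standard error of the mean of n 0/1 outcomes, using the sample (n - 1) variance."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# n = 25 is an assumption inferred from the table, not a documented value.
print(round(accuracy_stderr(0.32, 25), 4))
```

With so few examples per subtask, the per-subject scores carry wide error bars, so differences between individual subjects should be read with caution.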
## Limitations and Bias
This model is not suitable for all use cases because it was trained for a limited time on modest hardware (a single Nvidia Titan V). As a result, it may produce irrelevant or nonsensical responses. For optimal performance, I recommend using a GPU with at least 16 GB of VRAM and downloading the model manually instead of using the Transformers library. Here's how you should deploy the model:
```python
import torch
from transformers import GPT2LMHeadModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Locutusque/gpt2-xl-conversational")
model = GPT2LMHeadModel.from_pretrained("Locutusque/gpt2-xl-conversational", torch_dtype=torch.float16)
# Keep the embedding matrix in sync with the tokenizer's vocabulary size
model.resize_token_embeddings(len(tokenizer))
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Weights are loaded in float16, then cast to float32 on the target device
model.to(device, dtype=torch.float32)


def generate_text(model: GPT2LMHeadModel, tokenizer, prompt, max_length=256):
    # Wrap the prompt in the format the model was fine-tuned on
    prompt = f'<|USER|> {prompt} <|ASSISTANT|> '
    input_ids = tokenizer.encode(prompt, add_special_tokens=True, max_length=max_length, truncation=True, return_tensors="pt").to(device)
    output = model.generate(input_ids, do_sample=True, temperature=0.3, top_p=0.7, top_k=23, repetition_penalty=1.176, max_length=max_length, pad_token_id=tokenizer.pad_token_id, eos_token_id=tokenizer.eos_token_id)
    # Special tokens are kept so the <|USER|> and <|ASSISTANT|> markers stay visible
    output_text = tokenizer.decode(output[0], skip_special_tokens=False)
    return output_text


# Loop to interact with the model
while True:
    prompt = input("Enter a prompt (or 'q' to quit): ")
    if prompt == "q":
        break
    output_text = generate_text(model, tokenizer, prompt, max_length=1022)
    print(output_text)
```
## Deploying and training the model
The model has been fine-tuned on a specific input format: ```"<|USER|> {user prompt} <|ASSISTANT|> {model prediction} "```
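
A small pair of helpers can keep prompts in this format and pull the reply back out of decoded text. This is a sketch under stated assumptions: `build_prompt` and `extract_reply` are hypothetical names, and `<|endoftext|>` is assumed to be the end-of-sequence marker (GPT-2's default), which may differ for this tokenizer.

```python
USER_TAG = "<|USER|>"
ASSISTANT_TAG = "<|ASSISTANT|>"
EOS_MARKER = "<|endoftext|>"  # assumption: GPT-2's default end-of-text token


def build_prompt(user_prompt: str) -> str:
    """Wrap a user message in the format the model was fine-tuned on."""
    return f"{USER_TAG} {user_prompt.strip()} {ASSISTANT_TAG} "


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant tag, without trailing markers."""
    reply = decoded.rsplit(ASSISTANT_TAG, 1)[-1]
    # Drop anything the model generated past the start of a new user turn.
    reply = reply.split(USER_TAG, 1)[0]
    return reply.replace(EOS_MARKER, "").strip()


decoded = build_prompt("Hi") + "Hello there!" + EOS_MARKER
print(extract_reply(decoded))
```

Splitting on the last `<|ASSISTANT|>` tag keeps the helper usable for multi-turn transcripts, where earlier turns also contain the tag.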