|
--- |
|
base_model: |
|
- MBZUAI/LaMini-GPT-774M |
|
library_name: transformers |
|
license: apache-2.0 |
|
model_name: ChatGPT-2.V2 |
|
tags: |
|
- conversational-ai |
|
- fine-tuning |
|
- gpt2 |
|
- causal-lm |
|
- chatbots |
|
--- |
|
|
|
# ChatGPT-2.V2 Model Card |
|
|
|
## Model Description |
|
|
|
**ChatGPT-2.V2** is a fine-tuned version of the **LaMini-GPT-774M** instruction model, optimized for conversational AI tasks. The model is trained to generate coherent, context-aware responses for interactive chatbot applications, with its conversational improvements over the base model coming from fine-tuning on a combination of public conversational datasets and curated, domain-specific datasets.
|
|
|
This model supports a context length of up to **1024 tokens**, enabling it to handle multi-turn conversations effectively. |
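
If you want to confirm the context window programmatically, it is exposed on the model config. The snippet below is a minimal check, assuming the standard GPT-2 config layout in which the attribute is named `n_positions`:

```python
from transformers import AutoConfig

# GPT-2-style configs store the maximum context length as `n_positions`
config = AutoConfig.from_pretrained("suriya7/ChatGPT-2.V2")
print(config.n_positions)  # expected to print 1024
```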
|
|
|
--- |
|
|
|
## Fine-Tuning Process |
|
|
|
The model was fine-tuned using **public conversational datasets** and **curated datasets** specifically tailored for interactive chat scenarios. The fine-tuning process aimed to: |
|
|
|
- Enhance the model's ability to understand and respond to diverse conversational prompts. |
|
- Improve context retention and relevance in multi-turn interactions. |
|
- Achieve a balance between creativity and accuracy for engaging chatbot responses. |
|
|
|
The training run converged to a **final loss of 1.2** on the fine-tuning data.
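
The exact data-preparation pipeline is not published, but the inference example below shows the `<|im_start|>`/`<|im_end|>` chat markup the model expects. As a rough sketch under that assumption, a conversation could be flattened into a single training string like this (an illustrative helper, not the actual fine-tuning script):

```python
def format_example(system, turns):
    """Flatten a conversation into the <|im_start|>/<|im_end|> markup used at inference time.

    `turns` is a list of (user_message, assistant_reply) pairs. The assistant turn
    mirrors how the inference example below stores replies (no explicit role tag).
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user_msg, assistant_msg in turns:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
        parts.append(f"<|im_start|>{assistant_msg}<|im_end|>")
    return "\n".join(parts)
```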
|
|
|
--- |
|
|
|
## Key Features |
|
|
|
- **Conversational Proficiency:** Designed for real-time chat applications with context-aware responses. |
|
- **Fine-Tuned Context Handling:** Supports up to 1024 tokens, enabling robust multi-turn conversations. |
|
- **Instruction-Based Foundation:** Built on the LaMini-GPT-774M instruction model, retaining its strengths in task-oriented dialogues.
|
|
|
--- |
|
|
|
## Training Details |
|
|
|
- **Base Model:** MBZUAI/LaMini-GPT-774M
|
- **Fine-Tuning Framework:** Hugging Face Transformers |
|
- **Datasets Used:** |
|
- Public conversational datasets (open-domain) |
|
- Custom curated datasets for domain-specific conversations |
|
- **Context Length:** 1024 tokens |
|
- **Final Loss:** 1.2 |
|
- **Learning Rate:** 1e-5 |
|
- **Training Epochs:** 3 |
|
- **fp16:** True |
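
The original training script is not included with the card, but the hyperparameters above map directly onto the Hugging Face `Trainer` API. The sketch below is illustrative only: `train_dataset` is assumed to be a pre-tokenized conversational dataset, and the output directory and batch size are placeholder values not documented in the card.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("MBZUAI/LaMini-GPT-774M")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained("MBZUAI/LaMini-GPT-774M")

args = TrainingArguments(
    output_dir="chatgpt2-v2-finetune",  # hypothetical output directory
    learning_rate=1e-5,                 # as listed above
    num_train_epochs=3,                 # as listed above
    fp16=True,                          # as listed above
    per_device_train_batch_size=8,      # not documented in the card; illustrative value
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,        # pre-tokenized conversational dataset (assumed)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()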
|
|
|
--- |
|
|
|
## Usage |
|
|
|
The model is intended for conversational AI applications, such as: |
|
|
|
- Chatbots for customer support |
|
- Interactive virtual assistants |
|
- Personalized conversational agents |
|
|
|
### Inference Example |
|
|
|
```python |
|
# Load the model and tokenizer
import torch
from transformers import AutoModelForCausalLM, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("suriya7/ChatGPT-2.V2")
model = AutoModelForCausalLM.from_pretrained("suriya7/ChatGPT-2.V2")

# Move the model to GPU if one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

# System message in the <|im_start|>/<|im_end|> chat markup the model expects
prompt = """
<|im_start|>system\nYou are a helpful AI assistant named Securitron, trained by Aquilax.<|im_end|>
"""

# Rolling window of the most recent conversation turns
conversation_history = []

while True:
    user_prompt = input("User Question: ")
    if user_prompt.lower() == 'break':
        break

    # Format the user's input as a user turn
    user = f"""<|im_start|>user
{user_prompt}<|im_end|>"""

    # Add the user's question to the conversation history
    conversation_history.append(user)

    # Keep only the five most recent turns so the prompt stays within the context window
    conversation_history = conversation_history[-5:]

    # Build the full prompt: system message followed by the retained turns
    current_prompt = prompt + "\n".join(conversation_history)

    # Tokenize the prompt and move it to the same device as the model
    inputs = tokenizer(current_prompt, return_tensors="pt", truncation=True).input_ids.to(device)

    # generated_ids grows by one token per iteration of the streaming loop
    generated_ids = inputs

    # Generate tokens one at a time so they can be printed as they are produced
    assistant_response = ""
    for _ in range(512):  # Maximum number of new tokens to generate
        next_token = model.generate(
            generated_ids,
            max_new_tokens=1,
            pad_token_id=50259,
            eos_token_id=50259,
            num_return_sequences=1,
            do_sample=True,
            top_k=50,
            temperature=0.2,
            top_p=0.90,
        )

        # Append the new token to the running sequence and decode it for display
        generated_ids = torch.cat([generated_ids, next_token[:, -1:]], dim=1)
        token_id = next_token[0, -1].item()
        token = tokenizer.decode([token_id], skip_special_tokens=True)

        assistant_response += token
        print(token, end="", flush=True)

        if token_id == 50259:  # EOS token for this model
            break

    print()
    # Store the assistant's reply so it is part of the context for the next turn
    conversation_history.append(f"<|im_start|>{assistant_response.strip()}<|im_end|>")
|
``` |
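
The loop above calls `generate` once per token so output can be printed as it is produced. If token-by-token control is not required, `transformers.TextStreamer` gives the same streaming effect with a single `generate` call. A brief sketch, reusing `model`, `tokenizer`, and the tokenized `inputs` from the example above:

```python
from transformers import TextStreamer

# Streams decoded tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    inputs,
    streamer=streamer,
    max_new_tokens=512,
    pad_token_id=50259,
    eos_token_id=50259,
    do_sample=True,
    top_k=50,
    temperature=0.2,
    top_p=0.90,
)
```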
|
|
|
## Limitations |
|
While the model performs well in general chat scenarios, it may encounter challenges in: |
|
|
|
- Highly domain-specific contexts not covered during fine-tuning. |
|
- Very long conversations that exceed the 1024-token context limit. |
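
For long conversations, one mitigation is to trim the oldest turns before the prompt exceeds the 1024-token limit. The helper below is a minimal sketch, reusing the `tokenizer` from the inference example to count tokens; the fixed turn-count slice used in that example is a simpler alternative.

```python
MAX_CONTEXT = 1024

def trim_history(system_prompt, history, reserve=256):
    """Drop the oldest turns until the prompt fits, leaving `reserve` tokens for the reply.

    Illustrative helper only; `history` is a list of formatted conversation turns.
    """
    while history:
        prompt = system_prompt + "\n".join(history)
        if len(tokenizer(prompt).input_ids) <= MAX_CONTEXT - reserve:
            break
        history = history[1:]  # drop the oldest turn
    return history
```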
|
|
|
## Additional Disclaimer |
|
|
|
Please note that this model has not been specifically aligned using techniques such as Direct Preference Optimization (DPO) or similar methodologies. While the model has been fine-tuned to perform well in chat-based tasks, its responses are not guaranteed to reflect human-aligned preferences or ethical guidelines. Use with caution in sensitive or critical applications. |
|
|