---
base_model:
  - MBZUAI/LaMini-GPT-774M
library_name: transformers
license: apache-2.0
model_name: ChatGPT-2.V2
tags:
  - conversational-ai
  - fine-tuning
  - gpt2
  - causal-lm
  - chatbots
---

# ChatGPT-2.V2 Model Card

## Model Description

ChatGPT-2.V2 is a fine-tuned version of the MBZUAI/LaMini-GPT-774M instruction model, optimized for conversational AI tasks. The model is trained to generate coherent, context-aware responses for interactive chatbot applications; fine-tuning on a combination of public conversational datasets and curated, domain-specific datasets substantially improved its conversational performance over the base model.

This model supports a context length of up to 1024 tokens, enabling it to handle multi-turn conversations effectively.
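
As a quick sanity check before sending a prompt, you can count tokens with the model's own tokenizer. A minimal sketch (the 1024-token limit comes from this card; the helper function is illustrative):

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("suriya7/ChatGPT-2.V2")

def fits_in_context(prompt: str, max_tokens: int = 1024) -> bool:
    # Tokenize without truncation so the count reflects the full prompt
    return len(tokenizer(prompt).input_ids) <= max_tokens

print(fits_in_context("Hello, how are you?"))  # True for short prompts
```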


## Fine-Tuning Process

The model was fine-tuned using public conversational datasets and curated datasets specifically tailored for interactive chat scenarios. The fine-tuning process aimed to:

- Enhance the model's ability to understand and respond to diverse conversational prompts.
- Improve context retention and relevance in multi-turn interactions.
- Achieve a balance between creativity and accuracy for engaging chatbot responses.

The training run reached a final loss of 1.2, indicating stable convergence on the fine-tuning data.
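
The exact preprocessing pipeline is not published. As an illustration, each training conversation would plausibly be serialized into the same `<|im_start|>`/`<|im_end|>` template that the inference example below uses; the helper and field names here are hypothetical:

```python
def format_conversation(system: str, turns: list[tuple[str, str]]) -> str:
    """Serialize one conversation into the ChatML-style template this
    model uses at inference time (hypothetical helper, for illustration)."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user_msg, assistant_msg in turns:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>")
    return "\n".join(parts)

example = format_conversation(
    "You are a helpful AI assistant.",
    [("What is fine-tuning?", "Fine-tuning adapts a pretrained model to a new task.")],
)
print(example)
```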


## Key Features

- **Conversational Proficiency:** Designed for real-time chat applications with context-aware responses.
- **Fine-Tuned Context Handling:** Supports up to 1024 tokens, enabling robust multi-turn conversations.
- **Instruction-Based Foundation:** Built on the LaMini-GPT-774M instruction model, retaining its strengths in task-oriented dialogues.

## Training Details

- **Base Model:** MBZUAI/LaMini-GPT-774M
- **Fine-Tuning Framework:** Hugging Face Transformers
- **Datasets Used:**
  - Public conversational datasets (open-domain)
  - Custom curated datasets for domain-specific conversations
- **Context Length:** 1024 tokens
- **Final Loss:** 1.2
- **Learning Rate:** 1e-5
- **Training Epochs:** 3
- **Mixed Precision:** fp16 enabled
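
The fine-tuning script itself is not published. As a rough sketch of how the hyperparameters above map onto the Hugging Face Transformers API (the `output_dir` and anything not listed above are illustrative assumptions):

```python
from transformers import AutoModelForCausalLM, TrainingArguments

# Base model named in this card
model = AutoModelForCausalLM.from_pretrained("MBZUAI/LaMini-GPT-774M")

# Hyperparameters taken from the Training Details list above;
# fp16 mixed precision requires a CUDA device at training time.
training_args = TrainingArguments(
    output_dir="chatgpt-2-v2",  # illustrative path, not from the card
    learning_rate=1e-5,
    num_train_epochs=3,
    fp16=True,
)

# Wiring into Trainer would then look like:
# trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
# trainer.train()
```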

## Usage

The model is intended for conversational AI applications, such as:

- Chatbots for customer support
- Interactive virtual assistants
- Personalized conversational agents

### Inference Example

```python
# Load model directly
from transformers import AutoModelForCausalLM, GPT2Tokenizer
import torch

tokenizer = GPT2Tokenizer.from_pretrained("suriya7/ChatGPT-2.V2")
model = AutoModelForCausalLM.from_pretrained("suriya7/ChatGPT-2.V2")

# Move the model to GPU if one is available (done once, outside the chat loop)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

prompt = """
<|im_start|>system\nYou are a helpful AI assistant named Securitron, trained by Aquilax.<|im_end|>
"""

# Rolling window of the most recent conversation turns
conversation_history = []

while True:
    user_prompt = input("User Question: ")
    if user_prompt.lower() == 'break':
        break

    # Format the user's input in the ChatML-style template
    user = f"""<|im_start|>user
{user_prompt}<|im_end|>"""

    # Add the user's question to the conversation history
    conversation_history.append(user)

    # Keep the two most recent exchanges plus the current question
    # (an odd count, so the history always starts with a user turn)
    conversation_history = conversation_history[-5:]

    # Build the full prompt
    current_prompt = prompt + "\n".join(conversation_history)

    # Tokenize the prompt and move it to the same device as the model
    encodeds = tokenizer(current_prompt, return_tensors="pt", truncation=True).input_ids
    inputs = encodeds.to(device)

    # Seed the generation buffer with the encoded prompt
    generated_ids = inputs

    # Generate and stream tokens one at a time
    assistant_response = ""
    for _ in range(512):  # max token limit for streaming
        next_token = model.generate(
            generated_ids,
            max_new_tokens=1,
            pad_token_id=50259,
            eos_token_id=50259,
            num_return_sequences=1,
            do_sample=True,
            top_k=50,
            temperature=0.2,
            top_p=0.90
        )

        generated_ids = torch.cat([generated_ids, next_token[:, -1:]], dim=1)
        token_id = next_token[0, -1].item()
        token = tokenizer.decode([token_id], skip_special_tokens=True)

        assistant_response += token
        print(token, end="", flush=True)

        if token_id == 50259:  # EOS token
            break

    print()
    # The model emits its own "assistant" role tag, so re-wrapping the decoded
    # text in <|im_start|>...<|im_end|> reconstructs a complete assistant turn
    conversation_history.append(f"<|im_start|>{assistant_response.strip()}<|im_end|>")
```
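
Note that the loop above calls `generate` once per token, which re-runs the full forward pass at every step. If per-token streaming is all you need, `transformers.TextStreamer` produces the same streamed output from a single `generate` call. A minimal sketch reusing the `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    inputs,
    max_new_tokens=512,
    pad_token_id=50259,
    eos_token_id=50259,
    do_sample=True,
    top_k=50,
    temperature=0.2,
    top_p=0.90,
    streamer=streamer,
)
```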

## Limitations

While the model performs well in general chat scenarios, it may encounter challenges in:

  • Highly domain-specific contexts not covered during fine-tuning.
  • Very long conversations that exceed the 1024-token context limit.
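
A simple mitigation for the context limit is to drop the oldest turns before tokenizing. A minimal sketch (the trimming policy here is an illustrative assumption, not part of the released model):

```python
def trim_to_context(system_prompt, history, tokenizer, max_tokens=1024):
    """Drop the oldest turns until the serialized prompt fits the context window."""
    while history:
        candidate = system_prompt + "\n".join(history)
        if len(tokenizer(candidate).input_ids) <= max_tokens:
            return candidate
        history = history[1:]  # drop the oldest turn first
    return system_prompt
```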

## Additional Disclaimer

Please note that this model has not been specifically aligned using techniques such as Direct Preference Optimization (DPO) or similar methodologies. While the model has been fine-tuned to perform well in chat-based tasks, its responses are not guaranteed to reflect human-aligned preferences or ethical guidelines. Use with caution in sensitive or critical applications.