|
--- |
|
base_model: |
|
- MBZUAI/LaMini-GPT-774M |
|
library_name: transformers |
|
license: apache-2.0 |
|
model_name: ChatGPT-2.V2 |
|
tags: |
|
- conversational-ai |
|
- fine-tuning |
|
- gpt2 |
|
- causal-lm |
|
- chatbots |
|
--- |
|
|
|
# ChatGPT-2.V2 Model Card |
|
|
|
## Model Description |
|
|
|
**ChatGPT-2.V2** is a fine-tuned version of the **LaMini-GPT-774M** instruction model, optimized for conversational AI tasks. The model is trained to generate coherent, context-aware responses for interactive chatbot applications, with its conversational improvements over the base model coming from fine-tuning on a combination of public conversational datasets and curated, domain-specific datasets.
|
|
|
This model supports a context length of up to **1024 tokens**, enabling it to handle multi-turn conversations effectively. |
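
If you want to confirm the context window programmatically, it is exposed on the model config. The snippet below is a minimal check, assuming the standard GPT-2 config layout in which the attribute is named `n_positions`:

```python
from transformers import AutoConfig

# GPT-2-style configs store the maximum context length as `n_positions`
config = AutoConfig.from_pretrained("suriya7/ChatGPT-2.V2")
print(config.n_positions)  # expected to print 1024
```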
|
|
|
--- |
|
|
|
## Fine-Tuning Process |
|
|
|
The model was fine-tuned using **public conversational datasets** and **curated datasets** specifically tailored for interactive chat scenarios. The fine-tuning process aimed to: |
|
|
|
- Enhance the model's ability to understand and respond to diverse conversational prompts. |
|
- Improve context retention and relevance in multi-turn interactions. |
|
- Achieve a balance between creativity and accuracy for engaging chatbot responses. |
|
|
|
The training run converged to a **final loss of 1.2** on the fine-tuning data.
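
The exact data-preparation pipeline is not published, but the inference example below shows the `<|im_start|>`/`<|im_end|>` chat markup the model expects. As a rough sketch under that assumption, a conversation could be flattened into a single training string like this (an illustrative helper, not the actual fine-tuning script):

```python
def format_example(system, turns):
    """Flatten a conversation into the <|im_start|>/<|im_end|> markup used at inference time.

    `turns` is a list of (user_message, assistant_reply) pairs. The assistant turn
    mirrors how the inference example below stores replies (no explicit role tag).
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user_msg, assistant_msg in turns:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
        parts.append(f"<|im_start|>{assistant_msg}<|im_end|>")
    return "\n".join(parts)
```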
|
|
|
--- |
|
|
|
## Key Features |
|
|
|
- **Conversational Proficiency:** Designed for real-time chat applications with context-aware responses. |
|
- **Fine-Tuned Context Handling:** Supports up to 1024 tokens, enabling robust multi-turn conversations. |
|
- **Instruction-Based Foundation:** Built on the LaMini-GPT-774M instruction model, retaining its strengths in task-oriented dialogues.
|
|
|
--- |
|
|
|
## Training Details |
|
|
|
- **Base Model:** MBZUAI/LaMini-GPT-774M
|
- **Fine-Tuning Framework:** Hugging Face Transformers |
|
- **Datasets Used:** |
|
- Public conversational datasets (open-domain) |
|
- Custom curated datasets for domain-specific conversations |
|
- **Context Length:** 1024 tokens |
|
- **Final Loss:** 1.2 |
|
- **Learning Rate:** 1e-5 |
|
- **Training Epochs:** 3 |
|
- **fp16:** True |
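
The original training script is not included with the card, but the hyperparameters above map directly onto the Hugging Face `Trainer` API. The sketch below is illustrative only: `train_dataset` is assumed to be a pre-tokenized conversational dataset, and the output directory and batch size are placeholder values not documented in the card.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("MBZUAI/LaMini-GPT-774M")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained("MBZUAI/LaMini-GPT-774M")

args = TrainingArguments(
    output_dir="chatgpt2-v2-finetune",  # hypothetical output directory
    learning_rate=1e-5,                 # as listed above
    num_train_epochs=3,                 # as listed above
    fp16=True,                          # as listed above
    per_device_train_batch_size=8,      # not documented in the card; illustrative value
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,        # pre-tokenized conversational dataset (assumed)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()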
|
|
|
--- |
|
|
|
## Usage |
|
|
|
The model is intended for conversational AI applications, such as: |
|
|
|
- Chatbots for customer support |
|
- Interactive virtual assistants |
|
- Personalized conversational agents |
|
|
|
### Inference Example |
|
|
|
```python |
|
# Load the model and tokenizer
import torch
from transformers import AutoModelForCausalLM, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("suriya7/ChatGPT-2.V2")
model = AutoModelForCausalLM.from_pretrained("suriya7/ChatGPT-2.V2")

# Move the model to GPU if one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

# System message in the <|im_start|>/<|im_end|> chat markup the model expects
prompt = """
<|im_start|>system\nYou are a helpful AI assistant named Securitron, trained by Aquilax.<|im_end|>
"""

# Rolling window of the most recent conversation turns
conversation_history = []

while True:
    user_prompt = input("User Question: ")
    if user_prompt.lower() == 'break':
        break

    # Format the user's input as a user turn
    user = f"""<|im_start|>user
{user_prompt}<|im_end|>"""

    # Add the user's question to the conversation history
    conversation_history.append(user)

    # Keep only the five most recent turns so the prompt stays within the context window
    conversation_history = conversation_history[-5:]

    # Build the full prompt: system message followed by the retained turns
    current_prompt = prompt + "\n".join(conversation_history)

    # Tokenize the prompt and move it to the same device as the model
    inputs = tokenizer(current_prompt, return_tensors="pt", truncation=True).input_ids.to(device)

    # generated_ids grows by one token per iteration of the streaming loop
    generated_ids = inputs

    # Generate tokens one at a time so they can be printed as they are produced
    assistant_response = ""
    for _ in range(512):  # Maximum number of new tokens to generate
        next_token = model.generate(
            generated_ids,
            max_new_tokens=1,
            pad_token_id=50259,
            eos_token_id=50259,
            num_return_sequences=1,
            do_sample=True,
            top_k=50,
            temperature=0.2,
            top_p=0.90,
        )

        # Append the new token to the running sequence and decode it for display
        generated_ids = torch.cat([generated_ids, next_token[:, -1:]], dim=1)
        token_id = next_token[0, -1].item()
        token = tokenizer.decode([token_id], skip_special_tokens=True)

        assistant_response += token
        print(token, end="", flush=True)

        if token_id == 50259:  # EOS token for this model
            break

    print()
    # Store the assistant's reply so it is part of the context for the next turn
    conversation_history.append(f"<|im_start|>{assistant_response.strip()}<|im_end|>")
|
``` |
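
The loop above calls `generate` once per token so output can be printed as it is produced. If token-by-token control is not required, `transformers.TextStreamer` gives the same streaming effect with a single `generate` call. A brief sketch, reusing `model`, `tokenizer`, and the tokenized `inputs` from the example above:

```python
from transformers import TextStreamer

# Streams decoded tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    inputs,
    streamer=streamer,
    max_new_tokens=512,
    pad_token_id=50259,
    eos_token_id=50259,
    do_sample=True,
    top_k=50,
    temperature=0.2,
    top_p=0.90,
)
```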
|
|
|
## Limitations |
|
While the model performs well in general chat scenarios, it may encounter challenges in: |
|
|
|
- Highly domain-specific contexts not covered during fine-tuning. |
|
- Very long conversations that exceed the 1024-token context limit. |
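
For long conversations, one mitigation is to trim the oldest turns before the prompt exceeds the 1024-token limit. The helper below is a minimal sketch, reusing the `tokenizer` from the inference example to count tokens; the fixed turn-count slice used in that example is a simpler alternative.

```python
MAX_CONTEXT = 1024

def trim_history(system_prompt, history, reserve=256):
    """Drop the oldest turns until the prompt fits, leaving `reserve` tokens for the reply.

    Illustrative helper only; `history` is a list of formatted conversation turns.
    """
    while history:
        prompt = system_prompt + "\n".join(history)
        if len(tokenizer(prompt).input_ids) <= MAX_CONTEXT - reserve:
            break
        history = history[1:]  # drop the oldest turn
    return history
```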
|
|
|
## Additional Disclaimer |
|
|
|
Please note that this model has not been specifically aligned using techniques such as Direct Preference Optimization (DPO) or similar methodologies. While the model has been fine-tuned to perform well in chat-based tasks, its responses are not guaranteed to reflect human-aligned preferences or ethical guidelines. Use with caution in sensitive or critical applications. |
|
|