license: apache-2.0
language:
- en
datasets:
- mlabonne/guanaco-llama2-1k
pipeline_tag: question-answering
tags:
- llm
- fine-tuned
- Llama 2 7b
- KiwiTech LLC
Model Card for syedzaidi-kiwi/Llama-2-7b-chat-finetune
This model is a fine-tuned version of Meta's Llama 2 7B variant for enhanced chat functionalities.
This modelcard aims to be a base template for new models. It has been generated using this raw template.
Model Details
Model Description
- Developed by: Syed Asad
- Model type: Fine-tuned Llama 2 7B variant
- Language(s) (NLP): English
- License: Apache-2.0
- Finetuned from model: NousResearch/Llama-2-7b-chat-hf
Model Sources
- Repository: syedzaidi-kiwi/Llama-2-7b-chat-finetune
- Paper: [https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/]
Uses
Direct Use
The model is intended for direct use in applications requiring conversational responses, such as chatbots or virtual assistants.
Out-of-Scope Use
The model is not designed for tasks outside of conversational AI, such as document summarization or translation.
Bias, Risks, and Limitations
Users should be aware of potential biases in the training data and limitations in the model's understanding of nuanced human language. Further evaluation is recommended for specific use cases.
How to Get Started with the Model
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
response = model.generate(**inputs)
print(tokenizer.decode(response[0], skip_special_tokens=True))
Training Details
Training Data
The model was fine-tuned using the dataset mlabonne/guanaco-llama2-1k.
Link: https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k
Training Procedure
Training Hyperparameters
- Training regime:
The model was fine-tuned using a mix of precision training techniques to balance training speed and model performance effectively.
While the exact precision format (e.g., fp32, fp16, bf16) utilized depends on the compute capabilities available, an emphasis was placed on leveraging mixed precision (fp16) training to accelerate the training process on compatible hardware. This approach allowed for faster computation and reduced memory usage without significant loss in training quality.
Users are encouraged to adjust the precision settings based on their hardware specifications to optimize performance further.
Speeds, Sizes, Times
To be tested by the KiwiTech Team
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model's performance was evaluated on a held-out test set from the mlabonne/guanaco-llama2-1k dataset.
This dataset comprises diverse conversational contexts to assess the model's generalization and robustness across various topics. [https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k]
Factors
Evaluation focused on several key factors to ensure the model's versatility and reliability in conversational AI applications:
Context understanding: The model's ability to maintain context and coherence over long conversations. Diversity of responses: The variety in the model's responses to similar prompts, indicating its creative and dynamic conversational capabilities. Safety and bias: Monitoring for any unintended biases in responses or generation of inappropriate content.
Metrics
To comprehensively assess the model's performance, the following metrics were utilized:
Perplexity (PPL): Lower perplexity scores indicate better understanding and generation of the text. BLEU Score: For measuring the similarity between the model's generated responses and a set of reference responses, indicating the model's accuracy in reproducing human-like answers. F1 Score: Evaluating the balance between precision and recall in the model's responses, useful for assessing conversational relevance. Safety and Bias Evaluation: Custom metrics were developed to quantify the model's performance in generating safe, unbiased content.
Results
To be Evaulated, will be updated in this section.
Summary
The fine-tuned model demonstrates significant improvements in generating coherent, diverse, and contextually appropriate responses across various conversational settings.
It represents a step forward in developing conversational AI systems that are both efficient and effective.
Continuous evaluation and monitoring are advised to further enhance and maintain the model's performance standards.
Technical Specifications
Model Architecture and Objective
Transformers
Compute Infrastructure
T4 GPU
Hardware
Fine Tuned on Apple M3 Pro (Silicon Chip)
Software
Google Colab Notebook Used
Citation
OriginalLlama2Citation Title: Llama 2: Open Foundation and Fine-Tuned Chat Models}, Authors: Hugo Touvron∗ Louis Martin† Kevin Stone† Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale Dan Bikel Lukas Blecher Cristian Canton Ferrer Moya Chen Guillem Cucurull David Esiobu Jude Fernandes Jeremy Fu Wenyin Fu Brian Fuller Cynthia Gao Vedanuj Goswami Naman Goyal Anthony Hartshorn Saghar Hosseini Rui Hou Hakan Inan Marcin Kardas Viktor Kerkez Madian Khabsa Isabel Kloumann Artem Korenev Punit Singh Koura Marie-Anne Lachaux Thibaut Lavril Jenya Lee Diana Liskovich Yinghai Lu Yuning Mao Xavier Martinet Todor Mihaylov Pushkar Mishra Igor Molybog Yixin Nie Andrew Poulton Jeremy Reizenstein Rashi Rungta Kalyan Saladi Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang Ross Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang Angela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom
Journal: Gen AI, Meta Year: 2023
Link to Research Paper: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
Model Card Authors
Syed Asad
Model Card Contact
Syed Asad ([email protected])