---
license: apache-2.0
language:
- en
datasets:
- mlabonne/guanaco-llama2-1k
pipeline_tag: text-generation
tags:
- llm
- fine-tuned
- Llama 2 7b
- KiwiTech LLC
---
# Model Card for syedzaidi-kiwi/Llama-2-7b-chat-finetune
This model is a fine-tuned version of Meta's Llama 2 7B variant for enhanced chat functionalities.
## Model Details
### Model Description
- **Developed by:** Syed Asad
- **Model type:** Fine-tuned Llama 2 7B variant
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** NousResearch/Llama-2-7b-chat-hf
### Model Sources
- **Repository:** [syedzaidi-kiwi/Llama-2-7b-chat-finetune](https://huggingface.co/syedzaidi-kiwi/Llama-2-7b-chat-finetune)
- **Paper:** [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
## Uses
### Direct Use
The model is intended for direct use in applications requiring conversational responses, such as chatbots or virtual assistants.
### Out-of-Scope Use
The model is not designed for tasks outside of conversational AI, such as document summarization or translation.
## Bias, Risks, and Limitations
Users should be aware of potential biases in the training data and limitations in the model's understanding of nuanced human language. Further evaluation is recommended for specific use cases.
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")

# The fine-tuning data follows the Llama 2 chat format, so wrap prompts in [INST] ... [/INST].
prompt = "[INST] Hello, how are you? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
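Alternatively, the high-level `pipeline` API can be used. This is a minimal sketch: the prompt and the generation parameters (`max_new_tokens`, `temperature`) are illustrative assumptions, not values prescribed for this model.
```python
from transformers import pipeline

# Illustrative generation settings; adjust max_new_tokens/temperature as needed.
generator = pipeline("text-generation", model="syedzaidi-kiwi/Llama-2-7b-chat-finetune")
result = generator(
    "[INST] What can you help me with? [/INST]",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])
```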
## Training Details
### Training Data
The model was fine-tuned on the [mlabonne/guanaco-llama2-1k](https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k) dataset.
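For reference, the dataset can be loaded with the `datasets` library. A minimal sketch, assuming the single `text` column described on the dataset page:
```python
from datasets import load_dataset

# Each record exposes a "text" field with conversations already wrapped
# in the Llama 2 [INST] ... [/INST] chat format.
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")
print(dataset)
print(dataset[0]["text"][:200])
```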
### Training Procedure
#### Training Hyperparameters
- **Training regime:**
Mixed-precision training was used to balance training speed and model quality. The exact precision format (fp32, fp16, or bf16) depends on the available compute, but the emphasis was on fp16 mixed precision to accelerate training on compatible hardware, reducing computation time and memory usage without a significant loss in training quality.
Users reproducing the fine-tune are encouraged to adjust the precision settings to their own hardware, as in the sketch below.
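As a minimal sketch of how mixed precision can be toggled via the `transformers` `TrainingArguments` (all hyperparameter values below are illustrative assumptions, not the configuration actually used for this fine-tune):
```python
import torch
from transformers import TrainingArguments

# Hypothetical values for illustration only; the exact hyperparameters used
# for this fine-tune are not recorded in this card.
has_cuda = torch.cuda.is_available()
use_bf16 = has_cuda and torch.cuda.is_bf16_supported()
use_fp16 = has_cuda and not use_bf16

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    bf16=use_bf16,   # prefer bf16 on GPUs that support it
    fp16=use_fp16,   # otherwise fall back to fp16 mixed precision
    logging_steps=25,
)
```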
#### Speeds, Sizes, Times
To be tested by the KiwiTech Team
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The model's performance was evaluated on a held-out test set from the [mlabonne/guanaco-llama2-1k](https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k) dataset, which comprises diverse conversational contexts for assessing the model's generalization and robustness across topics.
#### Factors
Evaluation focused on several key factors to ensure the model's versatility and reliability in conversational AI applications:
- **Context understanding:** the model's ability to maintain context and coherence over long conversations.
- **Diversity of responses:** the variety of the model's responses to similar prompts, indicating creative and dynamic conversational capabilities.
- **Safety and bias:** monitoring for unintended biases in responses or generation of inappropriate content.
#### Metrics
To comprehensively assess the model's performance, the following metrics were utilized:
- **Perplexity (PPL):** lower perplexity indicates better understanding and generation of the text.
- **BLEU score:** measures similarity between the model's generated responses and a set of reference responses, indicating how closely the model reproduces human-like answers.
- **F1 score:** evaluates the balance between precision and recall in the model's responses, useful for assessing conversational relevance.
- **Safety and bias evaluation:** custom metrics quantify the model's performance in generating safe, unbiased content.
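As a minimal sketch of the perplexity metric (not the exact evaluation harness used for this model), the language-modeling loss on a held-out example can be exponentiated:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "syedzaidi-kiwi/Llama-2-7b-chat-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "[INST] What is the capital of France? [/INST] The capital of France is Paris."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels equal to the input ids, the model returns the mean
    # cross-entropy loss over the sequence; exp(loss) is the perplexity.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"Perplexity: {torch.exp(loss).item():.2f}")
```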
### Results
To be evaluated; this section will be updated with results.
#### Summary
The fine-tuned model is intended to generate coherent, diverse, and contextually appropriate responses across a variety of conversational settings; the evaluation above will confirm the extent of the improvement over the base model.
It represents a step toward conversational AI systems that are both efficient and effective.
Continuous evaluation and monitoring are advised to further enhance and maintain the model's performance standards.
## Technical Specifications
### Model Architecture and Objective
Decoder-only transformer (Llama 2 7B architecture) with a causal language-modeling objective.
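The architecture can be confirmed by inspecting the model configuration, as in this minimal sketch:
```python
from transformers import AutoConfig

# Loads only the configuration, not the model weights.
config = AutoConfig.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")
print(config.model_type)  # expected: "llama"
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```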
### Compute Infrastructure
NVIDIA T4 GPU (Google Colab)
#### Hardware
Fine-tuned on an Apple M3 Pro (Apple silicon)
#### Software
Google Colab notebook
## Citation
Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom. *Llama 2: Open Foundation and Fine-Tuned Chat Models.* Meta GenAI, 2023. https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
## Model Card Authors
Syed Asad
## Model Card Contact
Syed Asad ([email protected])