mgustineli/llama-2-7b-gaudi-fine-tuning


	---
	library_name: transformers
	tags:
	- ipex
	- intel
	- gaudi
	- PEFT
	license: apache-2.0
	datasets:
	- timdettmers/openassistant-guanaco
	---

	# Model Card for llama-2-7b-gaudi-fine-tuning

	This model card was copied from [huggingface.co/migaraa/Gaudi_LoRA_Llama-2-7b-hf](https://huggingface.co/migaraa/Gaudi_LoRA_Llama-2-7b-hf).

	This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf), trained on the [timdettmers/openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco) dataset.


	## Model Details

	### Model Description

	This is the [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) model fine-tuned with Parameter-Efficient Fine-Tuning (PEFT), using Low-Rank Adaptation (LoRA), on the Intel Gaudi 2 AI accelerator. It is intended for text generation tasks such as chatbots, content creation, and other NLP applications.

	- Developed by: Migara Amarasinghe
	- Model type: LLM
	- Language(s) (NLP): English
	- Fine-tuned from model: [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
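
To give a sense of why LoRA counts as parameter-efficient, the sketch below counts the weights added when low-rank adapters are attached to a Llama-2-7b-sized model. The hidden size (4096) and layer count (32) are those of the base model; the rank and the choice of target projections are hypothetical, since this card does not state the adapter configuration that was actually used.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Llama-2-7b uses a hidden size of 4096 and 32 decoder layers.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# Hypothetical adapter settings; the card does not state the rank or target modules.
RANK = 8
TARGETS_PER_LAYER = 2  # for example, the query and value projections

per_matrix = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total = per_matrix * TARGETS_PER_LAYER * NUM_LAYERS
print(f"{total:,} trainable LoRA parameters")
```

Under these assumed settings the adapters add a few million weights on top of roughly 7 billion frozen ones, which is what keeps the fine-tuning run short.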


	## Uses

	### Direct Use

	This model can be used for text generation tasks such as:
	- Chatbots
	- Automated content creation
	- Text completion and augmentation
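
A minimal usage sketch follows. It assumes that this repository contains weights and tokenizer files that the `transformers` Auto classes can load directly; if it holds only LoRA adapter weights, the `peft` package and access to the base model would also be needed. The prompt template is also an assumption, modelled on the `### Human: ... ### Assistant: ...` layout of the openassistant-guanaco samples. `build_prompt` and `generate` are illustrative helper names, not part of any library.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a message in the assumed openassistant-guanaco style template."""
    return f"### Human: {user_message}### Assistant:"


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Generate a reply; downloads the model on first use."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mgustineli/llama-2-7b-gaudi-fine-tuning"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain LoRA in one sentence.")` would then return the model's continuation as a string.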

	### Out-of-Scope Use

	- Use in real-time applications where latency is critical
	- Use in highly sensitive domains without thorough evaluation and testing


	### Recommendations

	<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
	Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.


	## Training Details

	### Training Hyperparameters

	<!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
	- Training regime: Mixed precision training using bf16
	- Number of epochs: 3
	- Learning rate: 1e-4
	- Batch size: 16
	- Sequence length: 512 tokens
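
As a rough illustration, the values listed above can be written with the parameter names used by `transformers.TrainingArguments`. This is a sketch, not the original training script: it assumes the batch size is per device, a single device, and no gradient accumulation, none of which the card states. The dataset size passed to `optimizer_steps` is a made-up number for illustration only.

```python
import math

# Values from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "num_train_epochs": 3,
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 16,  # assumes the batch size is per device
    "bf16": True,
}
MAX_SEQ_LENGTH = 512  # applied at tokenization time, not a TrainingArguments field


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps on one device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


# Upper bound on tokens per step (every sequence filled to the full length).
tokens_per_step = training_kwargs["per_device_train_batch_size"] * MAX_SEQ_LENGTH

# 1,000 examples is a hypothetical dataset size, not the real one.
steps = optimizer_steps(
    1_000,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
print(tokens_per_step, steps)
```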


	## Technical Specifications

	### Compute Infrastructure

	#### Hardware

	- Intel Gaudi 2 AI Accelerator
	- Intel(R) Xeon(R) Platinum 8368 CPU

	#### Software

	- Transformers library
	- Optimum Habana library


	## Environmental Impact

	<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: Intel Gaudi 2 AI Accelerator
	- Hours used: < 1 hour
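
The kind of estimate such calculators produce can be sketched as energy used multiplied by the carbon intensity of the electricity grid. The power draw and grid intensity below are illustrative placeholders, not measured values for this training run, and the sketch ignores data-center overhead.

```python
def estimate_co2_grams(power_watts: float, hours: float, grid_g_per_kwh: float) -> float:
    """Estimate emissions in grams of CO2eq from power draw, runtime, and grid intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * grid_g_per_kwh


# Placeholder inputs: an assumed 600 W accelerator power draw, the 1-hour
# upper bound from the list above, and an assumed grid intensity of
# 400 gCO2eq per kWh.
estimate = estimate_co2_grams(power_watts=600, hours=1.0, grid_g_per_kwh=400)
print(f"{estimate:.0f} g CO2eq")
```

Swapping in the measured power draw and the local grid intensity gives a figure specific to a given run.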