---
library_name: transformers
tags:
- gaudi
- llama3
- llm
- optimum-habana
- text-generation-inference
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---

# Model Card for gopalakrishnan-d/meta-llama3-8b-alpaca-v1

This model was fine-tuned from meta-llama/Meta-Llama-3-8B on the tatsu-lab/alpaca dataset.

## Model Details

### Model Description

gopalakrishnan-d/meta-llama3-8b-alpaca-v1 is a fine-tuned variant of the 8-billion-parameter Llama 3 architecture. It was fine-tuned on the tatsu-lab/alpaca instruction dataset to improve performance across diverse language tasks, with training run on the Intel Gaudi 2 accelerator.

- **Hardware Type:** Intel Gaudi 2 Accelerator
- **Cloud Provider:** Intel® Tiber™ Developer Cloud
- **Developed by:** gopalakrishnan-d
- **Model type:** Fine-tuned LLM
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B

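## How to Get Started with the Model

The model loads through the standard transformers text-generation pipeline. The snippet below is a minimal sketch: the Alpaca-style prompt template is an assumption based on the tatsu-lab/alpaca training data, and the generation settings are illustrative rather than values from this card.

```python
# Minimal inference sketch using the standard transformers API.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="gopalakrishnan-d/meta-llama3-8b-alpaca-v1",
    torch_dtype=torch.bfloat16,  # assumes hardware with bf16 support
    device_map="auto",
)

# Alpaca-style instruction prompt (assumed format, matching the training dataset).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what fine-tuning an LLM means.\n\n"
    "### Response:\n"
)

output = generator(prompt, max_new_tokens=128, do_sample=False)
print(output[0]["generated_text"])
```
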
## Uses

- Customer Service Chatbots
- Content Generation Tools
- Educational Tutoring Systems
- Workflow Automation Systems
- Personalized Recommendation Engines

## Training Details

#### Training Hyperparameters

- learning_rate: 5e-06
- train_batch_size: 8
- seed: 100
- gradient_accumulation_steps: 1
- optimizer: Adam
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- lora_rank: 16
- lora_alpha: 32

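For reference, the hyperparameters above map onto a PEFT LoRA configuration and standard TrainingArguments roughly as sketched below. The actual run used optimum-habana on Gaudi 2; this generic transformers/peft sketch omits the Gaudi-specific arguments, and lora_dropout and target_modules are assumptions not listed on this card.

```python
# Illustrative mapping of the card's hyperparameters onto peft/transformers APIs.
# Not the exact optimum-habana training script used on Gaudi 2.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                 # lora_rank from this card
    lora_alpha=32,                        # lora_alpha from this card
    lora_dropout=0.05,                    # assumption: not listed on this card
    target_modules=["q_proj", "v_proj"],  # assumption: typical Llama attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./llama3-8b-alpaca",      # hypothetical output path
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    seed=100,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    optim="adamw_torch",                  # card says "Adam"; adamw_torch is the closest stock optimizer
)
```
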
## Evaluation

Evaluation results will be added in a future update.

### Results

To be added.