
Model Card for gopalakrishnan-d/meta-llama3-8b-alpaca-v1

This model was fine-tuned from meta-llama/Meta-Llama-3-8B.

Model Details

Model Description

The gopalakrishnan-d/meta-llama3-8b-alpaca-v1 model is a fine-tuned variant of meta-llama/Meta-Llama-3-8B, an 8-billion-parameter Llama 3 model. It was fine-tuned on an Intel Gaudi 2 accelerator, with the aim of improving performance across a range of language tasks.

  • Hardware Type: Intel Gaudi2 Accelerator
  • Cloud Provider: Intel® Tiber™ Developer Cloud
  • Developed by: gopalakrishnan-d
  • Model type: Fine-Tuned LLM
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: meta-llama/Meta-Llama-3-8B

Uses

  • Customer Service Chatbots
  • Content Generation Tools
  • Educational Tutoring Systems
  • Workflow Automation Systems
  • Personalized Recommendation Engines

Training Hyperparameters

- learning_rate: 5e-06 (Low Rate)
- train_batch_size: 8
- seed: 100
- gradient_accumulation_steps: 1
- optimizer: Adam 
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- lora_rank: 16
- lora_alpha: 32
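
To make the listed values easier to reason about, here is a minimal sketch in plain Python that collects them and derives a few quantities from them. It is not the training script used for this model. The names `TrainingConfig`, `effective_batch_size`, `lora_scaling`, and `linear_lr` are hypothetical, the total number of training steps is an assumed input that this card does not state, and the warmup rounding may differ from what a given training library does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters copied from the list above (hypothetical container)."""
    learning_rate: float = 5e-06
    train_batch_size: int = 8
    seed: int = 100
    gradient_accumulation_steps: int = 1
    lr_scheduler_warmup_ratio: float = 0.03
    lora_rank: int = 16
    lora_alpha: int = 32


def effective_batch_size(cfg: TrainingConfig) -> int:
    # Samples consumed per optimizer step on one device.
    return cfg.train_batch_size * cfg.gradient_accumulation_steps


def lora_scaling(cfg: TrainingConfig) -> float:
    # Standard LoRA scales the low-rank update by alpha / rank.
    return cfg.lora_alpha / cfg.lora_rank


def linear_lr(step: int, total_steps: int, cfg: TrainingConfig) -> float:
    """Linear warmup to the peak rate, then linear decay to zero.

    total_steps is an assumption supplied by the caller; the card
    does not report how many steps the run used.
    """
    warmup_steps = int(total_steps * cfg.lr_scheduler_warmup_ratio)
    if step < warmup_steps:
        return cfg.learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return cfg.learning_rate * remaining / max(1, total_steps - warmup_steps)


cfg = TrainingConfig()
```

With these values, `effective_batch_size(cfg)` is 8 and `lora_scaling(cfg)` is 2.0. For an assumed 1,000-step run, `linear_lr` ramps up over the first 30 steps and then decays linearly to zero.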

Evaluation

To be updated.

Results

