
Fine-Tuned LLaMA 3.2 1B Model

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on custom data. It has been trained to generate coherent and contextually relevant responses based on the input prompt.

Model Description

  • Model Type: LLaMA (Large Language Model Meta AI)
  • Architecture: Causal Language Model (LlamaForCausalLM)
  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • Fine-Tuning: Fine-tuned on domain-specific data to enhance performance on targeted tasks.
  • Intended Use: Suitable for various NLP tasks such as text generation, question answering, and code analysis.

Training Data

The model was fine-tuned on a dataset containing domain-specific examples designed to improve its understanding and generation capabilities within specific contexts. The training data included:

  • Code Samples: Various programming languages for code analysis and explanation.
  • Technical Documentation: To improve technical writing and explanation capabilities.

Training Details

  • Fine-Tuning Epochs: 5
  • Batch Size: 1 (with gradient accumulation)
  • Learning Rate: 1e-5
  • Hardware: Fine-tuned using an NVIDIA A10G GPU on a g5.16xlarge instance.
  • Optimizer: AdamW with weight decay (see the configuration sketch after this list)
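
These hyperparameters map directly onto a standard transformers Trainer run. The following is a minimal sketch of such a setup, not the exact training script: the gradient-accumulation factor, weight-decay value, output path, and dataset preparation are assumptions, since they are not specified above.

from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

training_args = TrainingArguments(
    output_dir="llama-3.2-1b-finetuned",  # hypothetical output path
    num_train_epochs=5,                   # epochs listed above
    per_device_train_batch_size=1,        # batch size listed above
    gradient_accumulation_steps=8,        # assumption: factor not stated above
    learning_rate=1e-5,                   # learning rate listed above
    weight_decay=0.01,                    # assumption: decay value not stated above
    optim="adamw_torch",                  # AdamW, as listed above
    fp16=True,                            # matches the FP16 checkpoint
)

# train_dataset is assumed to be a tokenized dataset prepared elsewhere
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()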

Model Configuration

  • Hidden Size: 2048
  • Number of Layers: 16
  • Number of Attention Heads: 32
  • Intermediate Size: 8192
  • Parameters: 1.24B (safetensors checkpoint, FP16)
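
These values can be read back from the checkpoint itself without downloading the weights; a quick check, using the same placeholder repository id as the examples below:

from transformers import AutoConfig

# Load only the configuration and print the fields listed above
config = AutoConfig.from_pretrained("username/your-fine-tuned-llama")
print(config.hidden_size, config.num_hidden_layers,
      config.num_attention_heads, config.intermediate_size)
# Expected output: 2048 16 32 8192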

Usage

To use this model, you can either download it and run it locally with the transformers library, or call it through the Hugging Face Inference API.

Using with transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("username/your-fine-tuned-llama")
model = AutoModelForCausalLM.from_pretrained("username/your-fine-tuned-llama")

# Generate text
prompt = "What does EigenLayer do exactly?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
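
Because the base checkpoint is instruction-tuned, prompts formatted with the model's chat template generally behave better than raw strings. A minimal sketch, using the same tokenizer and model loaded above:

# Wrap the prompt in the Llama 3.2 chat template before generating
messages = [{"role": "user", "content": "What does EigenLayer do exactly?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=150, do_sample=True, temperature=0.5)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))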

Using with the Hugging Face Inference API

You can also use the model via the Hugging Face API endpoint:

import requests

API_URL = "https://api-inference.huggingface.co/models/username/your-fine-tuned-llama"
headers = {"Authorization": "Bearer YOUR_HUGGING_FACE_API_TOKEN"}

def query(prompt):
    response = requests.post(API_URL, headers=headers, json={"inputs": prompt})
    response.raise_for_status()  # surface HTTP errors such as a bad token or a model still loading
    return response.json()

print(query("Explain how EigenLayer functions."))
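
The endpoint also accepts generation parameters alongside the prompt. A sketch of the payload shape (the parameter values here are illustrative, not tuned):

payload = {
    "inputs": "Explain how EigenLayer functions.",
    "parameters": {"max_new_tokens": 150, "temperature": 0.5},
}
print(requests.post(API_URL, headers=headers, json=payload).json())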

Limitations

  • The model may generate incorrect or biased information. Users should verify the outputs for critical applications.
  • Fine-tuning on domain-specific data may also bias generations toward that domain.

Ethical Considerations

Please ensure that the outputs of this model are used responsibly. The model may generate unintended or harmful content, so it should be used with caution in sensitive applications.

Acknowledgements

This model was fine-tuned based on meta-llama/Llama-3.2-1B-Instruct. Special thanks to the open-source community and contributors to the transformers library.
