metadata

library_name: transformers
license: mit
datasets:
  - gretelai/synthetic_text_to_sql
pipeline_tag: text-generation

Model Card for LLaMA 3.2 3B Instruct Text2SQL

Model Details

Model Description

This is a fine-tuned version of the LLaMA 3.2 3B Instruct model, optimized for Text-to-SQL generation. The model has been trained to convert natural language questions into SQL queries.

Developed by: Zhafran Ramadhan - XeAI
Model type: Decoder-only Language Model
Language(s): English - MultiLingual
License: MIT
Finetuned from model: LLaMA 3.2 3B Instruct
Log WandB Report: WandB Report

Model Sources

Repository: LLaMA 3.2 3B Instruct
Dataset: Synthetic Text2SQL

How to Get Started with the Model

Installation

pip install transformers torch accelerate

Input Format and Usage

The model expects input in a specific format following this template:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

[System context and database schema]

<|eot_id|><|start_header_id|>user<|end_header_id|>

[User query]

<|eot_id|><|start_header_id|>assistant<|end_header_id|>
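
The template above can be assembled with a small helper. This is a minimal sketch: `build_prompt` and its argument names are hypothetical and not part of the model repository. Only the special tokens are taken from the template, and the whitespace mirrors the prompt string used in the Basic Usage code.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the chat layout shown above."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}\n"
        "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"
    )


# Illustrative schema and question (same table as the usage example).
schema = "CREATE TABLE upgrades (id INT, cost FLOAT, type TEXT);"
prompt = build_prompt(
    system=f"You are a SQL query generator.\n\nContext: {schema}",
    user="Calculate the average cost of all upgrades.",
)
print(prompt)
```

Keeping the template in one function makes it harder to drop a special token by accident when the system text changes.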

Basic Usage

from transformers import pipeline
import torch

# Initialize the pipeline
generator = pipeline(
    "text-generation",
    model="XeAI/LLaMa_3.2_3B_Instruct_Text2SQL",  # Replace with your model ID
    torch_dtype=torch.float16,
    device_map="auto"
)

def generate_sql_query(context, question):
    # Format the prompt according to the training template
    prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 07 Nov 2024

You are a specialized SQL query generator focused solely on the provided RAG database. Your tasks are:
1. Generate SQL queries based on user requests that are related to querying the RAG database.
2. Only output the SQL query itself, without any additional explanation or commentary.
3. Use the context provided from the RAG database to craft accurate queries.

Context: {context}
<|eot_id|><|start_header_id|>user<|end_header_id|>

{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

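    response = generator(
        prompt,
        max_new_tokens=256,  # caps generated tokens only; max_length would also count the prompt
        num_return_sequences=1,
        temperature=0.1,
        do_sample=True,
        return_full_text=False,  # return only the completion, not the prompt
        pad_token_id=generator.tokenizer.eos_token_id
    )

    return response[0]['generated_text'].strip()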

# Example usage
context = """CREATE TABLE upgrades (id INT, cost FLOAT, type TEXT);
INSERT INTO upgrades (id, cost, type) VALUES 
(1, 500, 'Insulation'), 
(2, 1000, 'HVAC'), 
(3, 1500, 'Lighting');"""

questions = [
    "Find the energy efficiency upgrades with the highest cost and their types.",
    "Show me all upgrades costing less than 1000 dollars.",
    "Calculate the average cost of all upgrades."
]

for question in questions:
    sql = generate_sql_query(context, question)
    print(f"\nQuestion: {question}")
    print(f"Generated SQL: {sql}\n")
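
Depending on the pipeline settings, `generated_text` can contain the prompt followed by the completion, and the model may append commentary after the query. The helper below is one possible way to keep only the first SQL statement. It is a sketch under stated assumptions: `extract_sql` is a hypothetical name, and cutting at the first semicolon assumes the query contains no semicolon inside a string literal.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"


def extract_sql(generated_text: str) -> str:
    """Return only the completion, trimmed to the first SQL statement."""
    # Keep what follows the last assistant header, if the prompt is present.
    completion = generated_text.rsplit(ASSISTANT_HEADER, 1)[-1]
    # Drop an end-of-turn marker if it was decoded into the text.
    completion = completion.split("<|eot_id|>", 1)[0].strip()
    # Keep a single statement: cut after the first semicolon, if any.
    if ";" in completion:
        completion = completion.split(";", 1)[0] + ";"
    return completion


# Hand-written stand-in for a pipeline result (not real model output).
full_text = (
    "[prompt text]<|start_header_id|>assistant<|end_header_id|>\n\n"
    "SELECT AVG(cost) FROM upgrades; This query averages the cost."
)
print(extract_sql(full_text))
```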

Advanced Usage with Custom System Prompt

def generate_sql_with_custom_prompt(context, question, custom_system_prompt=""):
    base_prompt = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 07 Nov 2024

You are a specialized SQL query generator focused solely on the provided RAG database."""

    full_prompt = f"""{base_prompt}
{custom_system_prompt}

Context: {context}
<|eot_id|><|start_header_id|>user<|end_header_id|>

{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

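    response = generator(
        full_prompt,
        max_new_tokens=256,  # caps generated tokens only; max_length would also count the prompt
        num_return_sequences=1,
        temperature=0.1,
        do_sample=True,
        return_full_text=False,  # return only the completion, not the prompt
        pad_token_id=generator.tokenizer.eos_token_id
    )

    return response[0]['generated_text'].strip()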

Best Practices

Input Formatting:
- Always include the special tokens (<|begin_of_text|>, <|eot_id|>, etc.)
- Provide complete database schema in context
- Keep questions clear and focused on data retrieval
Parameter Configuration:
- Use temperature=0.1 for consistent SQL generation
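- Adjust the length limit based on expected query complexity (max_length counts prompt plus output tokens; max_new_tokens counts output tokens only)
- Keep do_sample=True so that the temperature setting takes effect; use do_sample=False for fully deterministic (greedy) decoding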
Context Management:
- Include relevant table schemas
- Provide sample data when needed
- Keep context concise but complete
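
One way to follow the context guidelines above is to read the schema directly from the database instead of typing it by hand. The sketch below assumes a SQLite database. `schema_context` is a hypothetical helper, and rendering sample rows as SQL comments is an illustrative choice, not the format used in the training data.

```python
import sqlite3


def schema_context(conn: sqlite3.Connection, sample_rows: int = 0) -> str:
    """Build a context string from the CREATE TABLE statements in a SQLite database."""
    parts = []
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for name, ddl in tables:
        parts.append(ddl.strip() + ";")
        if sample_rows > 0:
            # Table names come from sqlite_master; simple quoting is enough for a sketch.
            rows = conn.execute(
                f'SELECT * FROM "{name}" LIMIT ?', (sample_rows,)
            ).fetchall()
            for row in rows:
                parts.append(f"-- sample row from {name}: {row}")
    return "\n".join(parts)


# Demo on an in-memory database with the table from the usage example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upgrades (id INT, cost FLOAT, type TEXT)")
conn.execute("INSERT INTO upgrades VALUES (1, 500, 'Insulation')")
context = schema_context(conn, sample_rows=1)
print(context)
```

Limiting `sample_rows` to a small number keeps the context concise while still showing the model what typical values look like.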

Uses

Direct Use

The model is designed for converting natural language questions into SQL queries. It can be used for:

Database query generation from natural language
SQL query assistance
Data analysis automation

Out-of-Scope Use

Production deployment without human validation
Critical decision-making without human oversight
Direct database execution without query validation
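
As an illustration of the query validation mentioned above, the sketch below checks a generated query before it reaches a real database. It assumes SQLite syntax and a read-only use case. `validate_select` is a hypothetical helper, the checks are deliberately conservative (a semicolon inside a string literal is rejected), and it does not replace human review.

```python
import sqlite3
from typing import Tuple


def validate_select(sql: str, schema_sql: str) -> Tuple[bool, str]:
    """Check that `sql` is one SELECT statement that compiles against the schema."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not statement.lower().startswith("select"):
        return False, "only SELECT queries are allowed"
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)        # build a scratch copy of the schema
        conn.execute(f"EXPLAIN {statement}")  # compiles the query without running it
    except sqlite3.Error as exc:
        return False, f"SQLite rejected the query: {exc}"
    finally:
        conn.close()
    return True, "ok"


schema = "CREATE TABLE upgrades (id INT, cost FLOAT, type TEXT);"
ok_result = validate_select(
    "SELECT type, cost FROM upgrades ORDER BY cost DESC LIMIT 1;", schema
)
drop_result = validate_select("DROP TABLE upgrades;", schema)
bad_column_result = validate_select("SELECT price FROM upgrades;", schema)
print(ok_result, drop_result, bad_column_result, sep="\n")
```

Running the check against a scratch in-memory database means a malformed or destructive query never touches production data.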

Training Details

Training Data

Dataset: Synthetic Text2SQL
Data preprocessing: Standard text-to-SQL formatting

Training Procedure

Training Hyperparameters

Total Steps: 4,149
Final Training Loss: 0.1168
Evaluation Loss: 0.2125
Learning Rate: Scheduled, decaying to a final LR of 0
Epochs: 2.99
Gradient Norm: 1.3121

Performance Metrics

Training Samples/Second: 6.291
Evaluation Samples/Second: 19.325
Steps/Second: 3.868
Total FLOPs: 1.92e18

Training Infrastructure

Hardware: Single NVIDIA H100 GPU
Training Duration: 5-6 hours
Total Runtime: 16,491.75 seconds (about 4.6 hours)
Model Preparation Time: 0.0051 seconds
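
The reported figures can be cross-checked with simple arithmetic. The sketch below assumes the training throughput is an average over the total logged runtime. The variable names are illustrative, and the derived values are estimates, not additional logged metrics.

```python
total_runtime_s = 16_491.75   # Total Runtime (reported)
total_steps = 4_149           # Total Steps (reported)
train_samples_per_s = 6.291   # Training Samples/Second (reported)

runtime_hours = total_runtime_s / 3600
steps_per_s = total_steps / total_runtime_s
samples_seen = train_samples_per_s * total_runtime_s

print(f"runtime: {runtime_hours:.2f} h")
print(f"training steps per second: {steps_per_s:.3f}")
print(f"samples processed: {samples_seen:,.0f}")
```

Under these assumptions the logged runtime comes to about 4.6 hours and roughly 0.25 training steps per second, so the 3.868 steps/second listed under Performance Metrics may refer to a different phase, such as evaluation.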

Evaluation

Metrics

The model's performance was tracked using several key metrics:

Training Loss: Started at ~1.2, converged to 0.1168
Evaluation Loss: 0.2125
Processing Efficiency: 19.325 samples per second during evaluation
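
Loss values are often easier to interpret as perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy in nats, which is typical for causal language model training but is not stated in this card.

```python
import math

train_loss = 0.1168  # final training loss (reported)
eval_loss = 0.2125   # evaluation loss (reported)

# Perplexity is the exponential of the mean cross-entropy loss.
train_ppl = math.exp(train_loss)
eval_ppl = math.exp(eval_loss)

print(f"train perplexity: {train_ppl:.3f}")
print(f"eval perplexity: {eval_ppl:.3f}")
```

Under that assumption, both perplexities are close to 1, with the evaluation value somewhat higher than the training value.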

Results Summary

Achieved stable convergence after ~4000 steps
Maintained consistent performance metrics throughout training
Final training loss (0.1168) and evaluation loss (0.2125) remained reasonably close

Environmental Impact

Hardware Type: NVIDIA H100 GPU
Hours used: ~6 hours
Training Location: GPUaaS

Technical Specifications

Compute Infrastructure

GPU: NVIDIA H100
Training Duration: 5-6 hours
Total Steps: 4,149
FLOPs Utilized: 1.92e18

Model Card Contact

[Contact information to be added by Zhafran Ramadhan]

Note: This model card follows the guidelines set by the ML community for responsible AI development and deployment.