---
base_model: unsloth/llama-3.2-1b-instruct-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
---

# How to use?

- We use Unsloth for faster inference and load the adapter:

```python
from unsloth import FastLanguageModel

max_seq_length = 8192
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "patched-codes/Llama-3.2-1B-FastApply",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
```

- The model takes the original code and an update snippet as input and generates the final updated code:

```python
original_code = """import React from 'react';
import { Loader } from 'lucide-react';

interface ButtonProps {
  text: string;
  onClick?: () => void;
  loading?: boolean;
  disabled?: boolean;
  icon?: React.ReactNode;
}

const Button: React.FC<ButtonProps> = ({ text, onClick, loading = false, disabled = false, icon }) => (
  <button onClick={onClick} disabled={disabled || loading}>
    {loading ? <Loader className="animate-spin" /> : icon}
    {text}
  </button>
);

export default Button;
"""

update_snippet = """interface ButtonProps {
  variant?: 'primary' | 'secondary' | 'danger';
  size?: 'small' | 'medium' | 'large';
  // ... other props
}

const Button: React.FC<ButtonProps> = ({
  variant = 'primary',
  size = 'medium',
  // ... other props
}) => (
  // ... updated component body
);
"""
```

- Prepare your input following the prompt structure:

```python
input_text = f"""
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.
"""

messages = [
    {"role": "system", "content": "You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated."},
    {"role": "user", "content": input_text.strip()},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must add for generation
    return_tensors = "pt",
).to("cuda")

from transformers import TextStreamer

text_streamer = TextStreamer(tokenizer, skip_prompt = True)
output = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 8192,
    use_cache = True,
    temperature = 1.5,
    min_p = 0.1,
)

response = tokenizer.decode(output[0][len(inputs[0]):])
updated_code = response.split("<updated-code>")[1].split("</updated-code>")[0]
```

# Uploaded model

- **Developed by:** patched-codes
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3.2-1b-instruct-bnb-4bit

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
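
# Robust output extraction

The `split`-based extraction above raises an `IndexError` if the model ever omits the `<updated-code>` tags. Below is a slightly more defensive sketch, not part of the original workflow: it assumes the tag names from the prompt above and the Llama-3 `<|eot_id|>` end-of-turn token.

```python
import re

def extract_updated_code(response: str) -> str:
    """Return the code between <updated-code> tags, falling back to the raw response."""
    match = re.search(r"<updated-code>(.*?)</updated-code>", response, re.DOTALL)
    code = match.group(1) if match else response
    # tokenizer.decode without skip_special_tokens leaves the end-of-turn token in place
    return code.replace("<|eot_id|>", "").strip()

updated_code = extract_updated_code(response)
```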