
language: en

datasets:

  • 37 popular Python code repositories
  • See princeton-nlp/SWE-bench train split
  • See the make_datasets documentation on SWE-bench's GitHub for details on formatting input.

SWE-Llama

The SWE-Llama models are variants of CodeLlama fine-tuned on software engineering tasks extracted from real-world GitHub issues and pull requests. They were introduced and evaluated on the SWE-bench benchmark in the paper "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?" (Jimenez et al., 2023).

Model Details

  • Architecture: Transformer, based on CodeLlama architecture
  • Parameters: 7 billion for SWE-Llama-7b, 13 billion for SWE-Llama-13b
  • Objective: Generating patches to resolve GitHub issues, conditioned on issue description and code context

Training Data

SWE-Llama was fine-tuned on 19,000 issues and pull requests collected from 37 popular Python code repositories on GitHub, disjoint from those used in SWE-bench.

Training Procedure

  • Fine-tuned only the attention matrices using the LoRA method
  • Trained for 4 epochs with a batch size of 32
  • Selected best checkpoint based on validation perplexity
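
The checkpoint selection step above can be illustrated with a minimal sketch. It assumes perplexity is computed in the usual way, as the exponential of the mean per-token negative log-likelihood on a validation set. The function names, checkpoint names, and loss values below are hypothetical and are not taken from the SWE-Llama training code.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def select_best_checkpoint(val_nlls_by_checkpoint):
    """Return (name, perplexity) of the checkpoint with the lowest validation perplexity."""
    scored = {
        name: perplexity(nlls) for name, nlls in val_nlls_by_checkpoint.items()
    }
    best = min(scored, key=scored.get)
    return best, scored[best]


# Hypothetical per-token validation losses, one entry per epoch checkpoint.
val_nlls = {
    "epoch-1": [1.2, 1.0, 1.1],
    "epoch-2": [0.9, 0.8, 1.0],
    "epoch-3": [0.7, 0.9, 0.8],
    "epoch-4": [0.9, 0.9, 0.9],
}

best_name, best_ppl = select_best_checkpoint(val_nlls)
print(best_name, round(best_ppl, 3))
```

Lower perplexity means the model assigns higher probability to the held-out patches, so the later epoch is not necessarily the one kept.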

Evaluation Results

When evaluated on the SWE-bench benchmark:

  • SWE-Llama-7b achieved a 3.0% issue resolution rate using oracle context retrieval
  • SWE-Llama-13b achieved a 4.0% issue resolution rate using oracle context retrieval
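
As a rough illustration of how a resolution rate like the ones above is read, the sketch below assumes the rate is the percentage of evaluated issues whose generated patch resolves the issue. The helper name and the per-issue outcomes are hypothetical. They are not SWE-bench data.

```python
def resolution_rate(resolved_flags):
    """Percentage of task instances marked as resolved."""
    if not resolved_flags:
        raise ValueError("need at least one evaluated instance")
    # sum() counts the True entries; multiply first to keep the arithmetic exact here.
    return 100.0 * sum(resolved_flags) / len(resolved_flags)


# Hypothetical outcomes: 3 resolved issues out of 100 evaluated.
outcomes = [True] * 3 + [False] * 97
rate = resolution_rate(outcomes)
print(f"{rate:.1f}%")
```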

BibTeX Entry

@misc{jimenez2023swebench,
      title={SWE-bench: Can Language Models Resolve Real-World GitHub Issues?}, 
      author={Carlos E. Jimenez and John Yang 
        and Alexander Wettig and Shunyu Yao 
        and Kexin Pei and Ofir Press and Karthik Narasimhan},
      year={2023},
      eprint={2310.06770},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}