DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Abstract
The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models ranging in size from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K context window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models such as Codex and GPT-3.5. Furthermore, the DeepSeek-Coder models are released under a permissive license that allows both research and unrestricted commercial use.
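As a concrete illustration of the infilling capability described above, the sketch below shows how an open DeepSeek-Coder checkpoint could be queried for both plain completion and fill-in-the-middle generation via Hugging Face transformers. This is not code from the paper: the checkpoint id and the FIM sentinel-token spellings are assumptions taken from the public model card and should be verified against the released tokenizer.

```python
# Minimal sketch (not from the paper): completion and FIM-style infilling
# with a DeepSeek-Coder base checkpoint. Checkpoint id and sentinel-token
# strings below are assumptions; verify them against the actual tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/deepseek-coder-1.3b-base"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, trust_remote_code=True
)

def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Plain left-to-right code completion."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# Fill-in-the-middle: wrap the known prefix and suffix in sentinel tokens and
# let the model generate the missing middle. Token spellings are assumptions.
FIM_BEGIN, FIM_HOLE, FIM_END = "<｜fim▁begin｜>", "<｜fim▁hole｜>", "<｜fim▁end｜>"

def infill(prefix: str, suffix: str, max_new_tokens: int = 64) -> str:
    """Generate the span between a given prefix and suffix."""
    return complete(f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}",
                    max_new_tokens)

if __name__ == "__main__":
    print(infill("def add(a, b):\n    ", "\n    return result\n"))
```

The same prompt layout works for any span inside a file, which is what the 16K window is meant to support: the prefix and suffix can each be long project-level context rather than a single function.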
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Magicoder: Source Code Is All You Need (2023)
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (2024)
- WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation (2023)
- On the Effectiveness of Large Language Models in Domain-Specific Code Generation (2023)
- AST-T5: Structure-Aware Pretraining for Code Generation and Understanding (2024)