feihu.hf committed
Commit 83c0546 • 1 Parent(s): e3b2348
update README & LICENSE
README.md CHANGED
@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
+license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct/blob/main/LICENSE
 language:
 - en
 base_model:
@@ -25,7 +26,7 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
 - A more comprehensive foundation for real-world applications such as **Code Agents**. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies.
 - **Long-context Support** up to 128K tokens and can generate up to 8K tokens.
 
-**This repo contains the 1.5B Qwen2.5-Coder model**, which has the following features:
+**This repo contains the instruction-tuned 1.5B Qwen2.5-Coder model**, which has the following features:
 - Type: Causal Language Models
 - Training Stage: Pretraining & Post-training
 - Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings
@@ -33,7 +34,7 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
 - Number of Parameters (Non-Embedding): 1.31B
 - Number of Layers: 28
 - Number of Attention Heads (GQA): 12 for Q and 2 for KV
-- Context Length: Full 32,768 tokens
+- Context Length: Full 32,768 tokens
 
 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), and [Documentation](https://qwen.readthedocs.io/en/latest/).
 
@@ -85,6 +86,7 @@ generated_ids = [
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 ```
 
+
 ## Evaluation & Performance
 
 Detailed evaluation results are reported in this [📑 blog](https://qwenlm.github.io/blog/qwen2.5-coder/).
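
The context lines in the last hunk come from the model card's `transformers` quickstart. For reference, here is a minimal, self-contained sketch of that flow; the model name is the repo this commit belongs to, while the prompt, system message, and `max_new_tokens` value are illustrative choices rather than content of this diff.

```python
# Sketch of the chat-template quickstart that the last hunk's context lines belong to.
# The prompt, system message, and max_new_tokens below are illustrative, not from the diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The card's architecture bullets (28 layers, 12 Q / 2 KV heads, 32,768-token
# context, tied word embeddings) can be cross-checked against the loaded config.
cfg = model.config
print(cfg.num_hidden_layers, cfg.num_attention_heads, cfg.num_key_value_heads,
      cfg.max_position_embeddings, cfg.tie_word_embeddings)

prompt = "Write a quick sort algorithm."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Drop the prompt tokens so only the completion is decoded, matching the
# `generated_ids = [...]` and `response = tokenizer.batch_decode(...)` context lines above.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```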