alexmarques committed on
Commit 9feb924
1 Parent(s): 692b030

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ pipeline_tag: text-generation
 
 Compressed version of [Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b-hf) specialized for code-generation.
 This model was obtained by fine-tuning the Sparse Foundational model [SparseLlama-2-7b-pruned_50.2of4](https://huggingface.co/nm-testing/SparseLlama-2-7b-pruned_50.2of4) on the [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1) dataset.
-[SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation is used with [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) as teacher.
+[SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation was used with [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) as teacher.
 It achieves [HumanEval](https://arxiv.org/abs/2107.03374) pass@1 of 34.58%, whereas the dense [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) model achieves 32.03%.
 
 This model was produced as part of Neural Magic's Sparse Foundational Models initiative, and demonstrates the capability of Sparse Foundational Models to transfer to the code-generation domain.
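For readers of this commit, a minimal usage sketch of the resulting model with the Hugging Face `transformers` library is below. The repository id is a placeholder assumption (this commit does not name the model's own repo id), and the prompt is illustrative only.

```python
# Minimal sketch of loading and prompting the model (assumed repo id; replace with the actual one)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "neuralmagic/SparseLlama-2-7b-evolcodealpaca"  # placeholder, not confirmed by this commit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Illustrative code-generation prompt
prompt = "Write a Python function that returns the n-th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```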