uukuguy committed
Commit a243514
Parent: 094a068

Update README.md

Files changed (1)
README.md +41 -1
README.md CHANGED
@@ -1,3 +1,43 @@
 ---
-license: apache-2.0
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+datasets:
+- jondurbin/airoboros-2.2
+- Open-Orca/OpenOrca
+- garage-bAInd/Open-Platypus
+- WizardLM/WizardLM_evol_instruct_V2_196k
+- TokenBender/python_eval_instruct_51k
+tags:
+- llama-2
+- code
+license: llama2
+model-index:
+- name: SpeechlessCoder
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      type: openai_humaneval
+      name: HumanEval
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 50.0
+      verified: false
 ---
+
+<h1> speechless-sparsetral-16x7b-MoE </h1>
+
+speechless-sparsetral-16x7b-MoE is the MoE-upgraded version of [speechless-code-mistral-7b-v1.0](https://huggingface.co/uukuguy/speechless-code-mistral-7b-v1.0). The MoE fine-tuning adopts [Parameter-Efficient Sparsity Crafting (PESC)](https://arxiv.org/abs/2401.02731), an efficient fine-tuning architecture that uses LoRA modules as expert models, similar in concept to [multi-loras](https://github.com/uukuguy/multi_loras).
+
+Specifically, Mistral-7B-v0.1 is used as the base model, with 16 experts of which 4 expert outputs are selected for inference. The fine-tuning data includes codefuse-ai/Evol-Instruction-66k to enhance the model's code-generation ability. The specific datasets are as follows:
+
+- jondurbin/airoboros-2.2: filtered to categories related to coding, reasoning, and planning. 23,462 samples.
+- Open-Orca/OpenOrca: filtered to the 'cot' category of the 1M GPT-4 split. 74,440 samples.
+- garage-bAInd/Open-Platypus: 100%, 24,926 samples.
+- WizardLM/WizardLM_evol_instruct_V2_196k: coding-conversation part. 30,185 samples.
+- TokenBender/python_eval_instruct_51k: samples with "python" in the output. 40,309 samples.
+- Spider: 8,659 samples.
+- codefuse-ai/Evol-Instruction-66k: 100%, 66,862 samples.