jlpan/SteloCoder

@@ -4,23 +4,17 @@ base_model: bigcode/starcoder
 tags:
 - generated_from_trainer
 model-index:
-- name: moe_test
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# moe_test
-This model is a fine-tuned version of [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.1043
-- Learning Rate: 0.0
 ## Model description
-More information needed
 ## Intended uses & limitations
@@ -28,7 +22,7 @@ More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure

 tags:
 - generated_from_trainer
 model-index:
+- name: SteloCoder
   results: []
 ---
+# moe_training
+This is the final stage of training SteloCoder: MoE (Mixture of Experts) training. The dataset contains code translation samples from five programming languages to Python. The training, validation, and test data are processed from the XLCoST dataset.
 ## Model description
+The final model is named SteloCoder, a model designed for machine translation of code from multiple languages (C++, C#, Java, JavaScript, PHP) to Python. It is based on StarCoder, to which we added parameters using LoRA and MoE methods.
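To illustrate the general idea behind mixing experts, here is a minimal sketch of softmax gating over expert outputs. It is not SteloCoder's implementation: the one-expert-per-source-language layout, the function names `softmax` and `moe_combine`, and the toy vectors are all assumptions made for illustration.

```python
import math

# Assumption: one expert per source language. The card does not state this layout.
LANGUAGES = ["C++", "C#", "Java", "JavaScript", "PHP"]


def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_combine(gate_scores, expert_outputs):
    """Return the gate-weighted sum of the expert output vectors."""
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]


# Toy values: each hypothetical expert emits a 2-dimensional vector.
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
gate_scores = [0.0] * len(LANGUAGES)  # equal scores give a plain average
mixed = moe_combine(gate_scores, expert_outputs)
```

With equal gate scores every expert contributes the same weight, so `mixed` is the element-wise mean of the expert outputs. A trained gate would instead favor the expert that suits the input.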
 ## Intended uses & limitations
 ## Training and evaluation data
+The data is processed and sourced from the XLCoST dataset.
 ## Training procedure