Commit 17fe110 by jlpan
1 Parent(s): 8ea590b

Update README.md

Files changed (1)
  1. README.md +5 -11
README.md CHANGED
@@ -4,23 +4,17 @@ base_model: bigcode/starcoder
  tags:
  - generated_from_trainer
  model-index:
- - name: moe_test
+ - name: SteloCoder
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
+ # moe_training

- # moe_test
-
- This model is a fine-tuned version of [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1043
- - Learning Rate: 0.0
+ This is the final stage of training SteloCoder: MoE (Mixture of Experts) training. The dataset contains code translation samples from five programming languages to Python. The training, validation, and test data are processed from the XLCoST dataset.

  ## Model description

- More information needed
+ The final model, SteloCoder, is designed for machine translation of code from five languages (C++, C#, Java, JavaScript, PHP) to Python. It is based on StarCoder, to which we have added parameters using LoRA and MoE methods.

  ## Intended uses & limitations

@@ -28,7 +22,7 @@ More information needed

  ## Training and evaluation data

- More information needed
+ The data is processed from the XLCoST dataset.

  ## Training procedure