---
pipeline_tag: text2text-generation
tags:
- nmt
- aihub
---

# ENKO-T5-SMALL-V0

This is an English-to-Korean machine translation model, based on the T5-small architecture but trained from scratch.
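
As a quick illustration, here is a minimal usage sketch that loads the model with the Hugging Face `transformers` library. The repo id below is a hypothetical placeholder; substitute this repository's actual id.

```python
# Minimal usage sketch with Hugging Face transformers.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "enko-t5-small-v0"  # hypothetical placeholder; use this repository's actual id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```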

#### Code

The training code comes from my lecture [LLM을 위한 김기현의 NLP EXPRESS](https://fastcampus.co.kr/data_online_nlpexpress) (Kim Ki Hyun's NLP EXPRESS for LLMs), published on [FastCampus](https://fastcampus.co.kr/). You can find the training code in this GitHub [repo](https://github.com/kh-kim/nlp-express-practice).

#### Dataset

The training dataset is mainly from [AI-Hub](https://www.aihub.or.kr/) and consists of 11M parallel samples.

#### Tokenizer

I use a byte-level BPE tokenizer shared between the source and target languages. Since it must cover both languages, the vocabulary size is 60k.
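
For illustration, here is a minimal sketch of how such a shared tokenizer could be trained with the Hugging Face `tokenizers` library; the corpus file names and special tokens are assumptions, not the actual training setup.

```python
# Sketch of training a shared byte-level BPE tokenizer; file names are hypothetical.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.en", "corpus.ko"],           # hypothetical EN/KO corpus files
    vocab_size=60000,                           # 60k shared vocabulary, as described above
    special_tokens=["<pad>", "</s>", "<unk>"],  # T5-style special tokens (assumption)
)
tokenizer.save_model("tokenizer")  # writes vocab.json and merges.txt
```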

#### Architecture

The model architecture is based on T5-small, a popular encoder-decoder architecture. Please note that this model is trained from scratch, not fine-tuned.
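
For clarity, here is a minimal sketch of what from-scratch initialization looks like in `transformers`: the T5-small hyperparameters are reused, but the weights are randomly initialized rather than loaded from a pretrained checkpoint. The vocabulary size follows the tokenizer section; the actual training configuration may differ.

```python
# Sketch of building a T5-small-sized model with randomly initialized weights.
from transformers import T5Config, T5ForConditionalGeneration

config = T5Config.from_pretrained("t5-small")  # reuse T5-small hyperparameters only
config.vocab_size = 60000                      # match the 60k shared BPE vocabulary (assumption)
model = T5ForConditionalGeneration(config)     # random init; no pretrained weights loaded
```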

#### Evaluation

I evaluated the model on 5 different test sets. The following figures show the BLEU scores on each test set.

![](images/enko.png)

![](images/avg.png)

The DEEPCL model is a private version of this model, trained on much more data.
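
For reference, BLEU can be computed with a library such as `sacrebleu`; this is an illustrative sketch, not the exact evaluation script used here.

```python
# Illustrative BLEU computation with sacrebleu; the strings are placeholders.
import sacrebleu

hypotheses = ["오늘 날씨가 좋다."]      # model outputs (placeholder)
references = [["오늘은 날씨가 좋다."]]  # one list per reference set (placeholder)
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```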

#### Contact

Kim Ki Hyun ([email protected])