Update README_English.md
README_English.md CHANGED (+3 -3)
@@ -25,7 +25,7 @@ Model developed with OpenNMT for the Galician-Spanish pair using the transformer
 Translate an input_text using the NOS-MT-gl-es model with the following command:
 
 ```bash
-onmt_translate -src input_text -model NOS-MT-es
+onmt_translate -src input_text -model NOS-MT-gl-es -output ./output_file.txt -replace_unk -phrase_table phrase_table-gl-es.txt -gpu 0
 ```
 The resulting translation will be in the PATH indicated by the -output flag.
 
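For illustration, a complete run might look like the sketch below; the sample sentence and file names are assumptions, and the input is expected to be tokenised and BPE-encoded in the same way as the training data.

```bash
# Hypothetical end-to-end sketch: the sample sentence and file names are
# illustrative, not taken from the repository.
echo "Isto é unha proba" > input_text
onmt_translate -src input_text -model NOS-MT-gl-es -output ./output_file.txt \
  -replace_unk -phrase_table phrase_table-gl-es.txt -gpu 0
# The Spanish output is written to the path given by -output.
cat ./output_file.txt
```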
@@ -40,7 +40,7 @@ Authentic corpora are corpora produced by human translators. Synthetic corpora a
 
 Tokenisation was performed with a modified version of the [linguakit](https://github.com/citiususc/Linguakit) tokeniser (tokenizer.pl) that does not append a new line after each token.
 All BPE models were generated with the script [learn_bpe.py](https://github.com/OpenNMT/OpenNMT-py/blob/master/tools/learn_bpe.py)
-Using the .yaml in this repository it is possible to replicate the original training process. Before training the model, please verify that the path to each target (tgt) and (src) file is correct. Once this is done, proceed as follows:
+Using the .yaml in this repository, it is possible to replicate the original training process. Before training the model, please verify that the path to each target (tgt) and (src) file is correct. Once this is done, proceed as follows:
 
 ```bash
 onmt_build_vocab -config bpe-gl-es_emb.yaml -n_sample 100000
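For illustration, the preprocessing and training steps described above might be chained as in the sketch below. The file names, the number of BPE merge operations, and the tokeniser invocation are assumptions rather than the repository's actual commands; learn_bpe.py follows the subword-nmt interface, and onmt_train is the standard OpenNMT-py training entry point assumed here to consume the same config.

```bash
# Hypothetical pipeline sketch: paths, merge count and tokeniser flags are
# assumptions, not taken from this repository.
perl tokenizer.pl < train.gl > train.tok.gl                   # modified linguakit tokeniser
python learn_bpe.py -s 32000 < train.tok.gl > bpe.codes.gl    # learn BPE merge operations
onmt_build_vocab -config bpe-gl-es_emb.yaml -n_sample 100000  # build the vocabulary
onmt_train -config bpe-gl-es_emb.yaml                         # train with the same config
```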
@@ -53,7 +53,7 @@ You may find the parameters used for this model inside the file bpe-gl-es_emb.y
 
 **Evaluation**
 
-The BLEU evaluation of the models is
+The BLEU evaluation of the models is a mixture of internally developed tests (gold1, gold2, test-suite) and other datasets available in Galician (Flores).
 
 | GOLD 1 | GOLD 2 | FLORES | TEST-SUITE|
 | ------------- |:-------------:| -------:|----------:|
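For illustration, scores like those reported in the table might be reproduced with sacreBLEU; the test-set file names below are assumptions, and sacrebleu must be installed separately.

```bash
# Hypothetical scoring sketch: test-set file names are illustrative.
onmt_translate -src flores.gl -model NOS-MT-gl-es -output flores.hyp.es
sacrebleu flores.ref.es -i flores.hyp.es -m bleu
```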