lbourdois committed on
Commit
73d1611
1 Parent(s): f6cc9f7

Add multilingual to the language tag


Hi! This PR adds `multilingual` to the `language` tags in the model card metadata to improve referencing.
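
For anyone who wants to check the tag after merging, here is a minimal sketch that reads the `language` list out of a README's YAML front matter. It is only an illustration: `language_tags` and `SAMPLE` are hypothetical names, the parser is hand-rolled and handles only the simple block-list layout seen in this README, and the sample is abridged to the metadata lines visible in this diff (any entries above line 4 of README.md are not shown here and are left out).

```python
def language_tags(readme_text: str) -> list[str]:
    """Return the entries of the top-level `language:` list in the front matter.

    Hypothetical helper: a hand-rolled line parser, not a full YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter block at the top of the file

    tags = []
    in_language = False
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            break
        if not line.strip():  # ignore blank lines
            continue
        if not line.startswith((" ", "-")):  # a new top-level key starts here
            in_language = line.split(":", 1)[0] == "language"
            continue
        if in_language and line.strip().startswith("- "):
            tags.append(line.strip()[2:].strip())
    return tags


# Abridged sample assembled from the metadata lines visible in this diff.
SAMPLE = """---
language:
- en
- es
- oc
- multilingual
license: cc-by-4.0
tags:
- translation
- opus-mt-tc
---
# opus-mt-tc-big-en-cat_oci_spa
"""

print(language_tags(SAMPLE))
print("multilingual" in language_tags(SAMPLE))
```

A real check would more likely use a proper YAML parser; the line-based approach is used here only to keep the sketch dependency-free.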

Files changed (1)
  1. README.md +40 -75
README.md CHANGED
@@ -4,157 +4,122 @@ language:
4
  - en
5
  - es
6
  - oc
 
 
7
  tags:
8
  - translation
9
  - opus-mt-tc
10
- license: cc-by-4.0
11
  model-index:
12
  - name: opus-mt-tc-big-en-cat_oci_spa
13
  results:
14
  - task:
15
- name: Translation eng-cat
16
  type: translation
17
- args: eng-cat
18
  dataset:
19
  name: flores101-devtest
20
  type: flores_101
21
  args: eng cat devtest
22
  metrics:
23
- - name: BLEU
24
- type: bleu
25
  value: 41.5
26
- - task:
27
- name: Translation eng-oci
28
- type: translation
29
- args: eng-oci
30
- dataset:
31
- name: flores101-devtest
32
- type: flores_101
33
- args: eng oci devtest
34
- metrics:
35
- - name: BLEU
36
- type: bleu
37
  value: 25.4
38
- - task:
39
- name: Translation eng-spa
40
- type: translation
41
- args: eng-spa
42
- dataset:
43
- name: flores101-devtest
44
- type: flores_101
45
- args: eng spa devtest
46
- metrics:
47
- - name: BLEU
48
- type: bleu
49
  value: 28.1
 
50
  - task:
51
- name: Translation eng-spa
52
  type: translation
53
- args: eng-spa
54
  dataset:
55
  name: news-test2008
56
  type: news-test2008
57
  args: eng-spa
58
  metrics:
59
- - name: BLEU
60
- type: bleu
61
  value: 30.0
 
62
  - task:
63
- name: Translation eng-cat
64
  type: translation
65
- args: eng-cat
66
  dataset:
67
  name: tatoeba-test-v2021-08-07
68
  type: tatoeba_mt
69
  args: eng-cat
70
  metrics:
71
- - name: BLEU
72
- type: bleu
73
  value: 47.8
74
- - task:
75
- name: Translation eng-spa
76
- type: translation
77
- args: eng-spa
78
- dataset:
79
- name: tatoeba-test-v2021-08-07
80
- type: tatoeba_mt
81
- args: eng-spa
82
- metrics:
83
- - name: BLEU
84
- type: bleu
85
  value: 57.0
 
86
  - task:
87
- name: Translation eng-spa
88
  type: translation
89
- args: eng-spa
90
  dataset:
91
  name: tico19-test
92
  type: tico19-test
93
  args: eng-spa
94
  metrics:
95
- - name: BLEU
96
- type: bleu
97
  value: 52.5
 
98
  - task:
99
- name: Translation eng-spa
100
  type: translation
101
- args: eng-spa
102
  dataset:
103
  name: newstest2009
104
  type: wmt-2009-news
105
  args: eng-spa
106
  metrics:
107
- - name: BLEU
108
- type: bleu
109
  value: 30.5
 
110
  - task:
111
- name: Translation eng-spa
112
  type: translation
113
- args: eng-spa
114
  dataset:
115
  name: newstest2010
116
  type: wmt-2010-news
117
  args: eng-spa
118
  metrics:
119
- - name: BLEU
120
- type: bleu
121
  value: 37.4
 
122
  - task:
123
- name: Translation eng-spa
124
  type: translation
125
- args: eng-spa
126
  dataset:
127
  name: newstest2011
128
  type: wmt-2011-news
129
  args: eng-spa
130
  metrics:
131
- - name: BLEU
132
- type: bleu
133
  value: 39.1
 
134
  - task:
135
- name: Translation eng-spa
136
  type: translation
137
- args: eng-spa
138
  dataset:
139
  name: newstest2012
140
  type: wmt-2012-news
141
  args: eng-spa
142
  metrics:
143
- - name: BLEU
144
- type: bleu
145
  value: 39.6
 
146
  - task:
147
- name: Translation eng-spa
148
  type: translation
149
- args: eng-spa
150
  dataset:
151
  name: newstest2013
152
  type: wmt-2013-news
153
  args: eng-spa
154
  metrics:
155
- - name: BLEU
156
- type: bleu
157
  value: 35.8
 
158
  ---
159
  # opus-mt-tc-big-en-cat_oci_spa
160
 
@@ -162,7 +127,7 @@ Neural machine translation model for translating from English (en) to Catalan, O
162
 
163
  This model is part of the [OPUS-MT project](https://github.com/Helsinki-NLP/Opus-MT), an effort to make neural machine translation models widely available and accessible for many languages in the world. All models are originally trained using the amazing framework of [Marian NMT](https://marian-nmt.github.io/), an efficient NMT implementation written in pure C++. The models have been converted to pyTorch using the transformers library by huggingface. Training data is taken from [OPUS](https://opus.nlpl.eu/) and training pipelines use the procedures of [OPUS-MT-train](https://github.com/Helsinki-NLP/Opus-MT-train).
164
 
165
- * Publications: [OPUS-MT – Building open translation services for the World](https://aclanthology.org/2020.eamt-1.61/) and [The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT](https://aclanthology.org/2020.wmt-1.139/) (Please, cite if you use this model.)
166
 
167
  ```
168
  @inproceedings{tiedemann-thottingal-2020-opus,
@@ -226,8 +191,8 @@ for t in translated:
226
  print( tokenizer.decode(t, skip_special_tokens=True) )
227
 
228
  # expected output:
229
- # ¿Por qué quieres que Tom vaya conmigo?
230
- # Ella lo obligó a comer espinacas.
231
  ```
232
 
233
  You can also use OPUS-MT models with the transformers pipelines, for example:
@@ -237,7 +202,7 @@ from transformers import pipeline
237
  pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-en-cat_oci_spa")
238
  print(pipe(">>spa<< Why do you want Tom to go there with me?"))
239
 
240
- # expected output: ¿Por qué quieres que Tom vaya conmigo?
241
  ```
242
 
243
  ## Benchmarks
@@ -265,7 +230,7 @@ print(pipe(">>spa<< Why do you want Tom to go there with me?"))
265
 
266
  ## Acknowledgements
267
 
268
- The work is supported by the [European Language Grid](https://www.european-language-grid.eu/) as [pilot project 2866](https://live.european-language-grid.eu/catalogue/#/resource/projects/2866), by the [FoTran project](https://www.helsinki.fi/en/researchgroups/natural-language-understanding-with-cross-lingual-grounding), funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 771113), and the [MeMAD project](https://memad.eu/), funded by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 780069. We are also grateful for the generous computational resources and IT infrastructure provided by [CSC -- IT Center for Science](https://www.csc.fi/), Finland.
269
 
270
  ## Model conversion info
271
 
 
4
  - en
5
  - es
6
  - oc
7
+ - multilingual
8
+ license: cc-by-4.0
9
  tags:
10
  - translation
11
  - opus-mt-tc
 
12
  model-index:
13
  - name: opus-mt-tc-big-en-cat_oci_spa
14
  results:
15
  - task:
 
16
  type: translation
17
+ name: Translation eng-cat
18
  dataset:
19
  name: flores101-devtest
20
  type: flores_101
21
  args: eng cat devtest
22
  metrics:
23
+ - type: bleu
 
24
  value: 41.5
25
+ name: BLEU
26
+ - type: bleu
27
  value: 25.4
28
+ name: BLEU
29
+ - type: bleu
30
  value: 28.1
31
+ name: BLEU
32
  - task:
 
33
  type: translation
34
+ name: Translation eng-spa
35
  dataset:
36
  name: news-test2008
37
  type: news-test2008
38
  args: eng-spa
39
  metrics:
40
+ - type: bleu
 
41
  value: 30.0
42
+ name: BLEU
43
  - task:
 
44
  type: translation
45
+ name: Translation eng-cat
46
  dataset:
47
  name: tatoeba-test-v2021-08-07
48
  type: tatoeba_mt
49
  args: eng-cat
50
  metrics:
51
+ - type: bleu
 
52
  value: 47.8
53
+ name: BLEU
54
+ - type: bleu
55
  value: 57.0
56
+ name: BLEU
57
  - task:
 
58
  type: translation
59
+ name: Translation eng-spa
60
  dataset:
61
  name: tico19-test
62
  type: tico19-test
63
  args: eng-spa
64
  metrics:
65
+ - type: bleu
 
66
  value: 52.5
67
+ name: BLEU
68
  - task:
 
69
  type: translation
70
+ name: Translation eng-spa
71
  dataset:
72
  name: newstest2009
73
  type: wmt-2009-news
74
  args: eng-spa
75
  metrics:
76
+ - type: bleu
 
77
  value: 30.5
78
+ name: BLEU
79
  - task:
 
80
  type: translation
81
+ name: Translation eng-spa
82
  dataset:
83
  name: newstest2010
84
  type: wmt-2010-news
85
  args: eng-spa
86
  metrics:
87
+ - type: bleu
 
88
  value: 37.4
89
+ name: BLEU
90
  - task:
 
91
  type: translation
92
+ name: Translation eng-spa
93
  dataset:
94
  name: newstest2011
95
  type: wmt-2011-news
96
  args: eng-spa
97
  metrics:
98
+ - type: bleu
 
99
  value: 39.1
100
+ name: BLEU
101
  - task:
 
102
  type: translation
103
+ name: Translation eng-spa
104
  dataset:
105
  name: newstest2012
106
  type: wmt-2012-news
107
  args: eng-spa
108
  metrics:
109
+ - type: bleu
 
110
  value: 39.6
111
+ name: BLEU
112
  - task:
 
113
  type: translation
114
+ name: Translation eng-spa
115
  dataset:
116
  name: newstest2013
117
  type: wmt-2013-news
118
  args: eng-spa
119
  metrics:
120
+ - type: bleu
 
121
  value: 35.8
122
+ name: BLEU
123
  ---
124
  # opus-mt-tc-big-en-cat_oci_spa
125
 
 
127
 
128
  This model is part of the [OPUS-MT project](https://github.com/Helsinki-NLP/Opus-MT), an effort to make neural machine translation models widely available and accessible for many languages in the world. All models are originally trained using the amazing framework of [Marian NMT](https://marian-nmt.github.io/), an efficient NMT implementation written in pure C++. The models have been converted to pyTorch using the transformers library by huggingface. Training data is taken from [OPUS](https://opus.nlpl.eu/) and training pipelines use the procedures of [OPUS-MT-train](https://github.com/Helsinki-NLP/Opus-MT-train).
129
 
130
+ * Publications: [OPUS-MT – Building open translation services for the World](https://aclanthology.org/2020.eamt-1.61/) and [The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT](https://aclanthology.org/2020.wmt-1.139/) (Please, cite if you use this model.)
131
 
132
  ```
133
  @inproceedings{tiedemann-thottingal-2020-opus,
 
191
  print( tokenizer.decode(t, skip_special_tokens=True) )
192
 
193
  # expected output:
194
+ # ¿Por qué quieres que Tom vaya conmigo?
195
+ # Ella lo obligó a comer espinacas.
196
  ```
197
 
198
  You can also use OPUS-MT models with the transformers pipelines, for example:
 
202
  pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-en-cat_oci_spa")
203
  print(pipe(">>spa<< Why do you want Tom to go there with me?"))
204
 
205
+ # expected output: ¿Por qué quieres que Tom vaya conmigo?
206
  ```
207
 
208
  ## Benchmarks
 
230
 
231
  ## Acknowledgements
232
 
233
+ The work is supported by the [European Language Grid](https://www.european-language-grid.eu/) as [pilot project 2866](https://live.european-language-grid.eu/catalogue/#/resource/projects/2866), by the [FoTran project](https://www.helsinki.fi/en/researchgroups/natural-language-understanding-with-cross-lingual-grounding), funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 771113), and the [MeMAD project](https://memad.eu/), funded by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 780069. We are also grateful for the generous computational resources and IT infrastructure provided by [CSC -- IT Center for Science](https://www.csc.fi/), Finland.
234
 
235
  ## Model conversion info
236