tiedeman committed on
Commit
160f642
1 Parent(s): aca9db9

added a note about performance issues for low-resource languages

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -1099,7 +1099,7 @@ model-index:
 
  ## Model Details
 
- Neural machine translation model for translating from unknown (deu+eng+fra+por+spa) to Multiple languages (mul).
+ Neural machine translation model for translating from unknown (deu+eng+fra+por+spa) to Multiple languages (mul). Note that many of the listed languages will not be well supported by the model as the training data is very limited for the majority of the languages. Translation performance varies a lot and for a large number of language pairs it will not work at all.
 
  This model is part of the [OPUS-MT project](https://github.com/Helsinki-NLP/Opus-MT), an effort to make neural machine translation models widely available and accessible for many languages in the world. All models are originally trained using the amazing framework of [Marian NMT](https://marian-nmt.github.io/), an efficient NMT implementation written in pure C++. The models have been converted to pyTorch using the transformers library by huggingface. Training data is taken from [OPUS](https://opus.nlpl.eu/) and training pipelines use the procedures of [OPUS-MT-train](https://github.com/Helsinki-NLP/Opus-MT-train).
  **Model Description:**
@@ -1131,6 +1131,8 @@ This model can be used for translation and text-to-text generation.
 
  Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
 
+ Also note that many of the listed languages will not be well supported by the model as the training data is very limited for the majority of the languages. Translation performance varies a lot and for a large number of language pairs it will not work at all.
+
  ## How to Get Started With the Model
 
  A short example code: