library_name: transformers
tags:
  - NLP
  - Machine Translation
  - Moroccan Arabic
  - Darija
  - Modern Standard Arabic
  - MSA
  - AraT5
pipeline_tag: translation
widget:
  - text: آه، يالاه رجعت من شهر العسل ديالي في شفشاون
    example_title: Example 1
  - text: واش ممكن تعاونني؟ محتاج لمساعدة ديالك
    example_title: Example 2

Model Card for AraT5 - Moroccan Arabic to Modern Standard Arabic Translation

Model Details

Model Description

This model card describes a 🤗 Transformers model that translates Moroccan Arabic (Darija) into Modern Standard Arabic (MSA). It is fine-tuned from AraT5 base 1024.

  • Developed by: Said ET-TOUSY
  • Model type: Fine-tuned language translation model
  • Language(s) (NLP): Moroccan Arabic (Darija), Modern Standard Arabic (MSA)
  • Fine-tuned from model: AraT5 base 1024

Direct Use

This model is intended for direct use, without further fine-tuning, to translate text from Moroccan Arabic (Darija) into Modern Standard Arabic (MSA). It can be integrated into applications that need Darija-to-MSA translation.

Downstream Use

The model can also be fine-tuned for specific downstream tasks related to Moroccan Arabic and Modern Standard Arabic. This could include domain-specific translations or integration into larger NLP systems.
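
As a rough illustration of the data preparation that further fine-tuning would need, the sketch below tokenizes parallel Darija/MSA sentence pairs into model inputs and labels. The helper name `preprocess`, the column names `darija` and `msa`, and the length limit of 128 are assumptions made for illustration; they do not come from the original training setup.

```python
def preprocess(batch, tokenizer, max_length=128):
    """Tokenize a batch of parallel sentences for seq2seq fine-tuning.

    `batch` is a dict with the hypothetical keys "darija" (source sentences)
    and "msa" (target sentences); `tokenizer` is a Hugging Face tokenizer.
    """
    # Passing `text_target` makes the tokenizer return the target ids as "labels".
    return tokenizer(
        batch["darija"],
        text_target=batch["msa"],
        max_length=max_length,  # illustrative limit, not a documented setting
        truncation=True,
    )
```

With the `datasets` library, a function like this is typically applied with `dataset.map(preprocess, batched=True, fn_kwargs={"tokenizer": tokenizer})`.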

Out-of-Scope Use

The model is designed for translation from Moroccan Arabic into Modern Standard Arabic. It may not perform well on other language pairs or on tasks unrelated to translation.

Bias, Risks, and Limitations

The model's performance may be influenced by biases present in the training data, such as the representation of certain dialectal variations or cultural nuances. Additionally, the model's accuracy may vary depending on the complexity of the text being translated and the presence of out-of-vocabulary words.
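
One rough way to spot text the tokenizer handles poorly is to count how many subword pieces it produces per word; unusually high values can indicate rare or unseen vocabulary. The sketch below is an illustrative diagnostic and not part of the model: `subword_fertility` is a hypothetical helper, and whitespace splitting only approximates word boundaries.

```python
def subword_fertility(text, tokenize):
    """Return the average number of subword tokens per whitespace-separated word.

    `tokenize` is any callable mapping a string to a list of tokens,
    for example the `tokenize` method of a Hugging Face tokenizer.
    """
    words = text.split()
    if not words:
        # No words to measure, so report zero rather than dividing by zero.
        return 0.0
    return len(tokenize(text)) / len(words)
```

With a tokenizer loaded as in the getting-started section, this could be called as `subword_fertility(input_text, tokenizer.tokenize)`.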

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Careful evaluation of translated outputs, especially in sensitive or critical applications, is recommended. Furthermore, continuous monitoring and updating of the model with new data can help mitigate biases and improve performance over time.
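
For a quick sanity check of translated outputs against reference translations, a simple token-overlap score can be a starting point. The sketch below computes a token-level F1; it is a simplified stand-in, not an established metric such as BLEU or chrF, and `token_f1` is a hypothetical helper name.

```python
from collections import Counter


def token_f1(hypothesis, reference):
    """Token-level F1 overlap between a translation and a reference."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        # Two empty strings match perfectly; one empty string matches nothing.
        return float(hyp == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both sentences.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A score like this only measures surface overlap, so low-scoring outputs in sensitive applications still call for human review.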

How to Get Started with the Model

To get started with the model, follow the steps below:

  1. Install the transformers library, along with PyTorch, which the example below uses through return_tensors="pt".
  2. Load the pre-trained AraT5_Darija_to_MSA model fine-tuned for Moroccan Arabic to Modern Standard Arabic translation.
  3. Use the model to translate text from Moroccan Arabic to Modern Standard Arabic.

# Example code
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
>>> model_name = "Saidtaoussi/AraT5_Darija_to_MSA"
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
>>> model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example translation: tokenize the Darija input, generate, and decode the MSA output
>>> input_text = "آه، يالاه رجعت من شهر العسل ديالي في شفشاون"
>>> inputs = tokenizer(input_text, return_tensors="pt", padding=True)
>>> translated = model.generate(**inputs)
>>> output_text = tokenizer.decode(translated[0], skip_special_tokens=True)
>>> print(output_text)
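
The example above translates a single sentence. For several sentences at once, a batched helper along the following lines may be convenient. This is a minimal sketch: `translate_batch` is a hypothetical name, and the `max_new_tokens` value of 128 is an assumed cap rather than a setting documented for this model.

```python
def translate_batch(texts, tokenizer, model, max_new_tokens=128):
    """Translate a list of Darija sentences to MSA as a single batch."""
    # Pad to a common length so the sentences can be batched together,
    # and truncate inputs that exceed the tokenizer's maximum length.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # `max_new_tokens` caps the length of each generated translation.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode every sequence in the batch, dropping padding and end markers.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as shown earlier, calling `translate_batch([input_text], tokenizer, model)` returns a list with one translated string per input.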