---
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-ar-en
language:
- ar
- en
pipeline_tag: translation
widget:
- text: "salam ,labas ?"
- text: " kanbghik bzaf"
---




# This model translates Darija written in Latin script (Arabizi) into English. It was trained on 170,000 rows of translation examples.

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-ar-en](https://huggingface.co/Helsinki-NLP/opus-mt-ar-en) on the Darija Open Dataset (DODa), an ambitious open-source project dedicated to the Moroccan dialect. With about 150,000 entries, DODa is arguably the largest open-source collaborative project for Darija <=> English translation built for Natural Language Processing purposes.
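
As a minimal usage sketch, the model can probably be loaded through the `transformers` translation pipeline, like other MarianMT checkpoints. The repository id below is a hypothetical placeholder (this card does not state it), and the helper names `load_translator` and `translate` are illustrative only.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-username/your-darija-en-model"


def load_translator(model_id: str = MODEL_ID):
    """Build a translation pipeline (downloads the model on first call)."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("translation", model=model_id)


def translate(
    texts: List[str],
    translator: Callable[[List[str]], List[Dict[str, str]]],
) -> List[str]:
    """Translate a batch of Arabizi sentences and return plain strings."""
    outputs = translator(texts)
    # The translation pipeline returns one dict per input with a "translation_text" key.
    return [item["translation_text"] for item in outputs]


if __name__ == "__main__":
    translator = load_translator()
    # Example inputs taken from the widget section of this card.
    for line in translate(["salam ,labas ?", "kanbghik bzaf"], translator):
        print(line)
```

Passing the translator in as an argument keeps the batching logic separate from model loading, so the same helper works with any callable that follows the pipeline's output format.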



### Training hyperparameters

The following hyperparameters were used during training:
- GPU: H100 80GB SXM5
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- mixed_precision_training: FP16
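
The hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from `transformers`. This is a sketch only: the card does not say which training script was used, the mapping of `train_batch_size` to a per-device value assumes a single GPU, and `output_dir` is a hypothetical path.

```python
from typing import Any, Dict

# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS: Dict[str, Any] = {
    "output_dir": "./darija-en-finetuned",  # hypothetical output path
    "per_device_train_batch_size": 32,      # assumes one GPU, so total batch size is 32
    "per_device_eval_batch_size": 32,
    "num_train_epochs": 5,
    "fp16": True,                           # FP16 mixed precision; requires a CUDA GPU
}


def build_training_args(**overrides: Any):
    """Create Seq2SeqTrainingArguments from the card's values plus any overrides."""
    # Imported lazily so the constants above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**TRAINING_KWARGS, **overrides})
```

Settings not listed in the card, such as learning rate and optimizer, are left at the library defaults here, which may differ from what was actually used.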