JuhaoLiang
committed on
Commit
•
fdce1d0
1
Parent(s):
c9d9814
add README.md
Alignment_at_Pre_training__a_Case_Study_of_Aligning_LLMs_in_Arabic.pdf
ADDED
Binary file (602 kB).
|
|
README.md
CHANGED
@@ -1,3 +1,84 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
|
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
language:
|
4 |
+
- ar
|
5 |
+
- zh
|
6 |
+
- en
|
7 |
+
---
|
8 |
+
|
9 |
+
# <b>AceGPT</b>
|
10 |
+
|
11 |
+
AceGPT is a fully fine-tuned generative text model collection, particularly focused on the Arabic language domain.
|
12 |
+
This is the repository for version 2 of the 70B-chat pre-trained model, developed based on [AceGPT-v2-70B](https://huggingface.co/FreedomIntelligence/AceGPT-v2-70B).
|
13 |
+
|
14 |
+
---
|
15 |
+
## Model Details
|
16 |
+
We have released the AceGPT family of large language models, a collection of fully fine-tuned generative text models ranging from 7B to 70B parameters. The models fall into two main categories: AceGPT and AceGPT-chat, where AceGPT-chat is optimized for dialogue applications. In multiple benchmark tests, our models outperform all currently available open-source Arabic dialogue models. In our human evaluations, they also reach satisfaction levels comparable to some closed-source models, such as ChatGPT, in Arabic.
|
17 |
+
## Model Developers
|
18 |
+
We are from the King Abdullah University of Science and Technology (KAUST), the Chinese University of Hong Kong, Shenzhen (CUHKSZ) and the Shenzhen Research Institute of Big Data (SRIBD).
|
19 |
+
## Variations
|
20 |
+
The AceGPT family comes in a range of parameter sizes: 7B, 8B, 13B, 32B and 70B. Each size has a base variant and a -chat variant.
|
21 |
+
## Paper
|
22 |
+
The paper can be accessed at [link](https://huggingface.co/FreedomIntelligence/AceGPT-v2-70B-Chat/blob/main/Alignment_at_Pre_training__a_Case_Study_of_Aligning_LLMs_in_Arabic.pdf).
|
23 |
+
## Input
|
24 |
+
Models take text input only.
|
25 |
+
## Output
|
26 |
+
Models output text only.
|
27 |
+
## Model Evaluation Results
|
28 |
+
|
29 |
+
Benchmark evaluations are conducted using accuracy or F1 scores as metrics, following the evaluation framework available at https://github.com/FreedomIntelligence/AceGPT/tree/main.
|
30 |
+
([**ArabicMMLU**](https://github.com/mbzuai-nlp/ArabicMMLU) is assessed based on its source settings.)
|
31 |
+
| | [MMLU (Huang et al. (2023))](https://github.com/FreedomIntelligence/AceGPT) | [ArabicMMLU](https://github.com/mbzuai-nlp/ArabicMMLU) | EXAMS | ACVA (clean) | ACVA (all) | Arabic BoolQ | Arabic ARC-C | Average |
|
32 |
+
|------------------|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
|
33 |
+
| Llama2-7B-chat | 13.78 | 33.40 | 13.05 | 20.99 | 21.80 | 34.92 | 23.72 | 21.09 |
|
34 |
+
| Llama2-13B-chat | 8.92 | 36.12 | 16.11 | 35.12 | 35.71 | 54.13 | 27.47 | 30.51 |
|
35 |
+
| Jais-13B-chat | 19.52 | 54.83 | 19.71 | 66.75 | 61.41 | 41.25 | 11.95 | 39.34 |
|
36 |
+
| Phoenix-7b | 29.72 | 44.74 | 31.93 | 43.80 | 41.86 | 66.70 | 33.53 | 41.75 |
|
37 |
+
| AceGPT-7B-chat | 30.69 | 36.31 | 33.73 | 53.87 | 53.07 | 60.70 | 38.05 | 43.77 |
|
38 |
+
| Mistral-7B-Instruct-v0.2 | 27.93 | 41.44 | 21.56 | 64.56 | 63.47 | 60.18 | 35.67 | 44.97 |
|
39 |
+
| AceGPT-13B-Chat | 35.59 | 52.61 | 38.72 | 70.82 | 70.21 | 66.85 | 44.20 | 54.14 |
|
40 |
+
| Jais-30B-chat-v3 | 35.68 | 62.36 | 32.24 | 73.63 | 73.66 | 76.30 | 51.02 | 57.84 |
|
41 |
+
| Jais-30B-chat-v1 | 38.12 | 59.33 | 40.45 | 74.46 | 72.41 | 73.76 | 50.94 | 58.49 |
|
42 |
+
| AceGPT-v1.5-7B-Chat | 45.77 | 56.62 | 43.69 | 69.46 | 70.86 | 72.45 | 60.49 | 59.90 |
|
43 |
+
| ChatGPT 3.5 Turbo | 46.07 | 57.72 | 45.63 | 74.45 | 76.88 | 76.12 | 60.24 | 62.44 |
|
44 |
+
| AceGPT-v1.5-13B-Chat | 47.33 | 61.70 | 48.37 | 76.90 | 76.37 | 69.33 | 63.99 | 63.42 |
|
45 |
+
| Qwen1.5-32B-Chat | 51.99 | 57.35 | 46.29 | 78.08 | 78.26 | 77.61 | 71.25 | 65.83 |
|
46 |
+
| AceGPT-v2-8B-Chat | 54.45 | 62.21 | 52.98 | 76.54 | 76.55 | 71.65 | 72.44 | 66.69 |
|
47 |
+
| AceGPT-v2-32B-Chat | 57.12 | 68.70 | 52.89 | <u>81.36</u> | <u>79.03</u> | 77.22 | 78.07 | 70.63 |
|
48 |
+
| **AceGPT-v2-70B-Chat** | <u>64.26</u> | **72.50** | <u>56.99</u> | 78.61 | 77.38 | <u>82.66</u> | <u>85.53</u> | <u>73.99</u> |
|
49 |
+
| GPT-4 | **65.04** | **72.50** | **57.76** | **84.06** | **79.43** | **85.99** | **85.67** | **75.78** |
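
The Average column appears to be the unweighted mean of the seven benchmark columns, rounded to two decimals. That is an inference from the numbers, not something the README states. A minimal Python sketch, where `average_score` is a hypothetical helper name and the scores are copied from two rows of the table above:

```python
from statistics import mean

# Per-benchmark scores copied from the table, in column order:
# MMLU, ArabicMMLU, EXAMS, ACVA (clean), ACVA (all), Arabic BoolQ, Arabic ARC-C
scores = {
    "AceGPT-v2-70B-Chat": [64.26, 72.50, 56.99, 78.61, 77.38, 82.66, 85.53],
    "GPT-4": [65.04, 72.50, 57.76, 84.06, 79.43, 85.99, 85.67],
}


def average_score(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    return round(mean(values), 2)


for model, values in scores.items():
    print(f"{model}: {average_score(values)}")
```

Under this assumption the sketch reproduces the reported averages for both rows (73.99 and 75.78).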
|
50 |
+
|
51 |
+
|
52 |
+
|
53 |
+
## Samples
|
54 |
+
#### Sample 1 (abstract_algebra)
|
55 |
+
* <b>input:</b>
|
56 |
+
"\<User\>: فيما يلي أسئلة الاختيار من متعدد حول جبر تجريدي\n\nسؤال: ما هو الدرجة للامتداد الميداني الناتج من Q(sqrt(2), sqrt(3), sqrt(18)) على Q؟\nA. 0\nB. 4\nC. 2\nD. 6\nمن فضلك اختر إجابة واحدة من بين 'A، B، C، D' دون شرح. \<Assistant\>: "
|
57 |
+
|
58 |
+
* <b>output:</b>
|
59 |
+
"B\n\nالشرح:\n\nالإجابة هي"
|
60 |
+
|
61 |
+
#### Sample 2 (business_ethics)
|
62 |
+
* <b>input:</b>
|
63 |
+
"\<User\>: فيما يلي أسئلة الاختيار من متعدد حول أخلاقيات الأعمال\n\nسؤال: تُصبح _______ مثل البيتكوين أكثر انتشارًا وتحمل مجموعة كبيرة من الآثار الأخلاقية المرتبطة بها، على سبيل المثال، إنها _______ وأكثر _______. ومع ذلك، تم استخدامها أيضًا للمشاركة في _______.\nA. العملات الرقمية، مكلفة، آمنة، جرائم مالية\nB. العملات التقليدية، رخيصة، غير آمنة، العطاء الخيري\nC. العملات الرقمية، رخيصة، آمنة، جرائم مالية\nD. العملات التقليدية، مكلفة، غير آمنة، العطاء الخيري\nمن فضلك اختر إجابة واحدة من بين 'A، B، C، D' دون شرح. \<Assistant\>: "
|
64 |
+
|
65 |
+
* <b>output:</b>
|
66 |
+
"C\n\nالشرح:\n\nالإجابة هي"
|
67 |
+
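
Both samples above share one prompt layout: a `<User>:` turn holding an Arabic header that names the subject, the question, four lettered options, and an instruction to answer without explanation, followed by an empty `<Assistant>:` turn. The Python sketch below rebuilds that layout and reads the chosen letter from a completion. `build_prompt` and `extract_choice` are hypothetical names. The sketch assumes the backslashes in `\<User\>` are Markdown escapes rather than part of the prompt, and that the leading letter of the output is the answer; the official evaluation framework may parse outputs differently.

```python
import re


def build_prompt(subject, question, choices):
    """Assemble a multiple-choice prompt in the layout shown in the samples."""
    header = f"فيما يلي أسئلة الاختيار من متعدد حول {subject}"
    options = "\n".join(
        f"{label}. {text}" for label, text in zip("ABCD", choices)
    )
    instruction = "من فضلك اختر إجابة واحدة من بين 'A، B، C، D' دون شرح."
    return (
        f"<User>: {header}\n\n"
        f"سؤال: {question}\n"
        f"{options}\n"
        f"{instruction} <Assistant>: "
    )


def extract_choice(output):
    """Return the leading option letter of a completion, or None if absent."""
    match = re.match(r"\s*([ABCD])\b", output)
    return match.group(1) if match else None


prompt = build_prompt(
    "جبر تجريدي",
    "ما هو الدرجة للامتداد الميداني الناتج من Q(sqrt(2), sqrt(3), sqrt(18)) على Q؟",
    ["0", "4", "2", "6"],
)
print(prompt)
print(extract_choice("B\n\nالشرح:\n\nالإجابة هي"))
```

Reading only the leading letter keeps the parse robust to trailing text such as the explanation the model starts to generate in both sample outputs.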
|
68 |
+
## Reference
|
69 |
+
```
|
70 |
+
@article{liang2024alignment,
|
71 |
+
title={Alignment at Pre-training! Towards Native Alignment for Arabic LLMs},
|
72 |
+
author={Liang, Juhao and Cai, Zhenyang and Zhu, Jianqing and Huang, Huang and Zong, Kewei and An, Bang and Alharthi, Mosen and He, Juncai and Zhang, Lian and Li, Haizhou and Wang, Benyou and Xu, Jinchao},
|
73 |
+
journal={},
|
74 |
+
year={2024}
|
75 |
+
}
|
76 |
+
```
|
77 |
+
```
|
78 |
+
@article{zhu2024second,
|
79 |
+
title={Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion},
|
80 |
+
author={Zhu, Jianqing and Huang, Huang and Lin, Zhihang and Liang, Juhao and Tang, Zhengyang and Almubarak, Khalid and Alharthi, Mosen and An, Bang and He, Juncai and Wu, Xiangbo and Yu, Fei and Chen, Junying and Ma, Zhuoheng and Du, Yuhao and Hu, Yan and Zhang, He and Alghamdi, Emad A. and Zhang, Lian and Sun, Ruoyu and Li, Haizhou and Wang, Benyou and Xu, Jinchao},
|
81 |
+
journal={},
|
82 |
+
year={2024}
|
83 |
+
}
|
84 |
+
```
|