Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


EEVE-Korean-Instruct-10.8B-v1.0 - GGUF
- Model creator: https://huggingface.co/yanolja/
- Original model: https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q2_K.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q2_K.gguf) | Q2_K | 3.77GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.IQ3_XS.gguf) | IQ3_XS | 4.18GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.IQ3_S.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.IQ3_S.gguf) | IQ3_S | 4.41GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_S.gguf) | Q3_K_S | 4.39GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.IQ3_M.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.IQ3_M.gguf) | IQ3_M | 4.56GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q3_K.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q3_K.gguf) | Q3_K | 4.88GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_M.gguf) | Q3_K_M | 4.88GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q3_K_L.gguf) | Q3_K_L | 5.31GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.IQ4_XS.gguf) | IQ4_XS | 5.47GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q4_0.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q4_0.gguf) | Q4_0 | 5.7GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.IQ4_NL.gguf) | IQ4_NL | 5.77GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q4_K_S.gguf) | Q4_K_S | 5.75GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q4_K.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q4_K.gguf) | Q4_K | 6.07GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q4_K_M.gguf) | Q4_K_M | 6.07GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q4_1.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q4_1.gguf) | Q4_1 | 6.32GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q5_0.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q5_0.gguf) | Q5_0 | 6.94GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q5_K_S.gguf) | Q5_K_S | 6.94GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q5_K.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q5_K.gguf) | Q5_K | 7.13GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q5_K_M.gguf) | Q5_K_M | 7.13GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q5_1.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q5_1.gguf) | Q5_1 | 7.56GB |
| [EEVE-Korean-Instruct-10.8B-v1.0.Q6_K.gguf](https://huggingface.co/RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf/blob/main/EEVE-Korean-Instruct-10.8B-v1.0.Q6_K.gguf) | Q6_K | 8.26GB |
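Each file in the table can also be fetched directly. A minimal sketch of building the direct-download link for any quant (the helper name `gguf_url` is ours, not part of this repo; `resolve/main` is the usual Hugging Face download path, whereas the table links use `blob/main` for browsing):

```python
# Build the direct-download URL for any quant listed in the table above.
# Assumption: every file follows the naming pattern visible in the table.
REPO = "RichardErkhov/yanolja_-_EEVE-Korean-Instruct-10.8B-v1.0-gguf"
BASE = "EEVE-Korean-Instruct-10.8B-v1.0"

def gguf_url(quant: str) -> str:
    """Return the direct-download URL for the given quant method, e.g. 'Q4_K_M'."""
    return f"https://huggingface.co/{REPO}/resolve/main/{BASE}.{quant}.gguf"

print(gguf_url("Q4_K_M"))
```

The same download can be done with `huggingface-cli download` or `hf_hub_download` from the `huggingface_hub` package, which also handle caching.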


Original model description:
---
license: apache-2.0
tags:
- generated_from_trainer
base_model: yanolja/EEVE-Korean-10.8B-v1.0
model-index:
- name: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
  results: []
---
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

<p align="left">
  <img src="https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0/resolve/main/eeve_logo.webp" width="50%"/>
</p>

# EEVE-Korean-Instruct-10.8B-v1.0

## Join Our Community on Discord!

If you're passionate about the field of Large Language Models and wish to exchange knowledge and insights, we warmly invite you to join our Discord server. Note that Korean is the primary language used on the server. The LLM landscape is evolving rapidly, and without active sharing, our collective knowledge risks becoming outdated quickly. Let's collaborate and drive greater impact together! Join us here: [Discord Link](https://discord.gg/b27bAHg95m).

## Our Dedicated Team (Alphabetical Order)
| Research | Engineering | Product Management | UX Design |
|-----------------|-----------------|--------------------|--------------|
| Myeongho Jeong | Geon Kim | Bokyung Huh | Eunsue Choi |
| Seungduk Kim | Rifqi Alfi | | |
| Seungtaek Choi | Sanghoon Han | | |
| | Suhyun Kang | | |

## About the Model

This model is a fine-tuned version of [yanolja/EEVE-Korean-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-10.8B-v1.0), a Korean-vocabulary-extended version of [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0). Specifically, we applied Direct Preference Optimization (DPO) via [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).

For more details, please refer to our technical report: [Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models](https://arxiv.org/abs/2402.14714).

## Prompt Template
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:
```
## How to Use it
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")
tokenizer = AutoTokenizer.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")

prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'
model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')

outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)
```

### Example Output
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.

(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 전주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.
```
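Note that the `transformers` example above uses the original full-precision weights. The GGUF files in this repo are intended for llama.cpp-compatible runtimes instead. A hedged sketch of running a downloaded quant with the llama.cpp CLI (assumes llama.cpp is built locally; the binary is named `main` in older builds, and the exact flags may differ across versions):

```shell
# Run the downloaded Q4_K_M quant with llama.cpp, using the prompt
# template from the "Prompt Template" section above.
./llama-cli \
  -m EEVE-Korean-Instruct-10.8B-v1.0.Q4_K_M.gguf \
  -n 256 \
  -p $'A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\'s questions.\nHuman: 한국의 수도는 어디인가요?\nAssistant:\n'
```

Bindings such as `llama-cpp-python` accept the same GGUF files if you prefer to drive the model from Python.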

### Training Data
- Korean-translated version of [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- Korean-translated version of [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)
- No other dataset was used

## Citation

```
@misc{kim2024efficient,
  title={Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models},
  author={Seungduk Kim and Seungtaek Choi and Myeongho Jeong},
  year={2024},
  eprint={2402.14714},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
```
@misc{cui2023ultrafeedback,
  title={UltraFeedback: Boosting Language Models with High-quality Feedback},
  author={Ganqu Cui and Lifan Yuan and Ning Ding and Guanming Yao and Wei Zhu and Yuan Ni and Guotong Xie and Zhiyuan Liu and Maosong Sun},
  year={2023},
  eprint={2310.01377},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
```
@misc{SlimOrcaDedup,
  title = {SlimOrca Dedup: A Deduplicated Subset of SlimOrca},
  author = {Wing Lian and Guan Wang and Bleys Goodson and Eugene Pentland and Austin Cook and Chanvichet Vong and "Teknium" and Nathan Hoos},
  year = {2023},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup/}
}
```
```
@misc{mukherjee2023orca,
  title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
  author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
  year={2023},
  eprint={2306.02707},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_yanolja__EEVE-Korean-Instruct-10.8B-v1.0).

| Metric                           | Value |
|----------------------------------|------:|
| Avg.                             | 66.48 |
| AI2 Reasoning Challenge (25-Shot)| 64.85 |
| HellaSwag (10-Shot)              | 83.04 |
| MMLU (5-Shot)                    | 64.23 |
| TruthfulQA (0-shot)              | 54.09 |
| Winogrande (5-shot)              | 81.93 |
| GSM8k (5-shot)                   | 50.72 |