---
language:
- en
- ko
license: llama3
library_name: transformers
tags:
- ko
- eval
- llm-eval
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- nayohan/feedback-collection-ko
- nayohan/feedback-collection-ko-chat
pipeline_tag: text-generation
---

# **Introduction**
This model was trained by translating the [prometheus-eval/Feedback-Collection](https://huggingface.co/datasets/prometheus-eval/Feedback-Collection) dataset into Korean and fine-tuning the llama3-8b-it model on the result.
Train dataset: [nayohan/feedback-collection-ko](https://huggingface.co/datasets/nayohan/feedback-collection-ko)

### **Loading the Model**

Use the following Python code to load the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nayohan/llama3-8b-it-prometheus-ko"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
  model_name,
  device_map="auto",
  torch_dtype=torch.bfloat16
)
```

### **Generating Text**
The system prompt is fixed. Set the score rubric according to your task, then change `orig_instruction`, `orig_response`, and `orig_reference_answer` to run an evaluation.
```python
system_prompt = """###Task Description: An instruction (might include an Input inside it), a response to evaluate, a reference answer that gets a score of 5, and a score rubric representing a evaluation criteria are given.
1. Write a detailed feedback that assess the quality of the response strictly based on the given score rubric, not evaluating in general.
2. After writing a feedback, write a score that is an integer between 1 and 5. You should refer to the score rubric.
3. The output format should look as follows: \"Feedback: (write a feedback for criteria) [RESULT] (an integer number between 1 and 5)\"
4. Please do not generate any other opening, closing, and explanations."""

sample = {
  'orig_instruction': "λ‚˜λŠ” 첨단 기술 ν”„λ‘œμ νŠΈλ₯Ό μ§„ν–‰ν•˜λŠ” νŒ€μ— μžˆλ‹€. κ·ΈλŸ¬λ‚˜ 졜근 ν”„λ‘œμ νŠΈ λ°©ν–₯을 놓고 νŒ€μ›λ“€ 사이에 지속적인 κ°ˆλ“±μ΄ λ°œμƒν•˜κ³  μžˆλ‹€. ν•œ 그룹은 급진적이고 μœ„ν—˜ν•˜μ§€λ§Œ 잠재적으둜 κ²Œμž„μ„ λ°”κΏ€ 수 μžˆλŠ” 접근법을 κ°•λ ₯ν•˜κ²Œ μ˜Ήν˜Έν•˜κ³  μžˆλ‹€. λŒ€μ‘°μ μœΌλ‘œ, λ‹€λ₯Έ 그룹은 보닀 μΈ‘μ •λ˜κ³  더 μ•ˆμ „ν•˜λ©° μž…μ¦λœ μ „λž΅μ„ μ„ ν˜Έν•œλ‹€. 결과적으둜 우리 νŒ€μ€ λΆ„μ—΄λ˜μ–΄ 진전을 이룰 수 μ—†λ‹€. 우리의 λŒ€ν™”λ₯Ό μ€‘μž¬ν•˜κ³  해결을 μ΄λŒμ–΄λ‚Ό 수 μžˆλŠ” AI λͺ¨λΈμ΄ ν•„μš”ν•˜λ‹€. μ΄λŸ¬ν•œ 상황에 λŒ€μ‘ν•˜μ—¬ AI λͺ¨λΈμ€ 무엇을 말해야 ν•˜λŠ”κ°€?",
  'orig_response': "κ·ΈλŸ¬λ‹ˆκΉŒ ν”„λ‘œμ νŠΈ λ°©ν–₯에 ν•©μ˜κ°€ μ•ˆ λ˜λŠ” νŒ€μ— μžˆλŠ” κ±° μ•„λ‹ˆμ•Ό? λ‹€λ“€ 잘 λ§žλ„λ‘ λ°°μ›Œμ•Ό ν•  것 κ°™λ„€μš”. μ–΄μ©Œλ©΄ 동전을 λ˜μ§€κ³  μ–΄λŠ μͺ½μ΄ μŠΉλ¦¬ν•˜λŠ”μ§€ 봐야 ν•  것 κ°™μ•„μš”. κ·Έλ ‡κ²Œ ν•˜λ©΄ λ…ΌμŸμ΄ μ—†κ³  λͺ¨λ‘κ°€ μΌν„°λ‘œ λŒμ•„κ°ˆ 수 μžˆμŠ΅λ‹ˆλ‹€. μœ„ν—˜ν•˜λ“  μ•ˆμ „ν•˜λ“  μƒκ΄€μ—†μ–΄μš”. ν•˜λ‚˜λ₯Ό κ³¨λΌμ„œ κ·Έλƒ₯ κ°€μ„Έμš”. κ²Œλ‹€κ°€, λͺ¨λ“  것이 λ¬΄λ„ˆμ§€λ©΄ μ„œλ‘œ λΉ„λ‚œν•˜κ³  λ„˜μ–΄κ°ˆ 수 μžˆμŠ΅λ‹ˆλ‹€. μ•„λ‹ˆλ©΄ 더 쒋은 것은, μ–΄λ–€ 그룹의 아이디어가 더 λ‚˜μ€μ§€ 보기 μœ„ν•œ 경쟁이 μ™œ μ•ˆ 돼? νŒ¨λ°°μžλŠ” 우승자λ₯Ό μœ„ν•΄ 점심을 사야 ν•΄μš”.",
  'orig_reference_answer': "이 νŒ€μ˜ λͺ¨λ“  μ‚¬λžŒλ“€μ΄ ν”„λ‘œμ νŠΈμ— 열정적이고 μ„±κ³΅ν•˜κΈ°λ₯Ό μ›ν•œλ‹€λŠ” 것은 λΆ„λͺ…ν•˜λ©°, μ΄λŠ” λͺ¨λ“  ν•΄κ²°μ˜ ν›Œλ₯­ν•œ μΆœλ°œμ μ΄λ‹€. λ˜ν•œ κ°ˆλ“±μ€ μœ„ν—˜κ³Ό ν˜μ‹ μ— λŒ€ν•œ μ„œλ‘œ λ‹€λ₯Έ κ΄€μ μ—μ„œ λ°œμƒν•œλ‹€λŠ” 것도 λΆ„λͺ…ν•©λ‹ˆλ‹€. λ‘˜ λ‹€ ν”„λ‘œμ νŠΈμ˜ 성곡에 μ€‘μš”ν•œ κ³ λ € μ‚¬ν•­μž…λ‹ˆλ‹€. 두 접근법 λͺ¨λ‘μ—μ„œ μœ νš¨ν•œ 점을 μΈμ •ν•˜λŠ” κ²ƒμœΌλ‘œ μ‹œμž‘ν•˜κ² μŠ΅λ‹ˆλ‹€. 급진적인 접근법을 μ˜Ήν˜Έν•˜λŠ” νŒ€μ€ 높은 보상과 획기적인 ν˜μ‹ μ˜ 잠재λ ₯에 μ˜ν•΄ μ£Όλ„λ˜λ©°, μ΄λŠ” λͺ¨λ“  첨단 ν”„λ‘œμ νŠΈμ—μ„œ ν›Œλ₯­ν•˜κ³  ν•„μˆ˜μ μž…λ‹ˆλ‹€.",
  'orig_criteria':'λͺ¨ν˜•μ€ λŒ€ν™”μ—μ„œ κ°ˆλ“± 해결을 μ–Όλ§ˆλ‚˜ 효과적으둜 μ²˜λ¦¬ν•˜λŠ”κ°€?',
  'orig_score1_description':'λͺ¨λΈμ€ κ°ˆλ“±μ΄λ‚˜ μ˜€ν•΄λ₯Ό κ°€μ€‘μ‹œμΌœ 문제λ₯Ό μ€‘μž¬ν•˜κ±°λ‚˜ ν•΄κ²°ν•  수 μžˆλŠ” λŠ₯λ ₯을 보이지 μ•ŠλŠ”λ‹€.',
  'orig_score2_description':'이 λͺ¨λΈμ€ κ°ˆλ“±μ— λŒ€ν•œ 인식이 μžˆμ§€λ§Œ 이λ₯Ό ν•΄κ²°ν•˜λ €λŠ” μ‹œλ„λŠ” νš¨κ³Όκ°€ μ—†κ±°λ‚˜ 잘λͺ»λœ 지침을 가지고 μžˆλ‹€.',
  'orig_score3_description':'이 λͺ¨λΈμ€ κ°ˆλ“±μ„ μ λ‹Ήνžˆ μ²˜λ¦¬ν•˜μ—¬ 일뢀 성곡적인 ν•΄κ²° μ „μˆ μ„ λ³΄μ—¬μ£Όμ§€λ§Œ 더 일관성이 μžˆμ„ 수 μžˆλ‹€.',
  'orig_score4_description':'이 λͺ¨λΈμ€ κ°ˆλ“±μ„ 잘 μ²˜λ¦¬ν•˜μ—¬ κΈ΄μž₯을 ν™•μ‚°μ‹œν‚€κ³  해결을 효과적으둜 μ•ˆλ‚΄ν•˜μ§€λ§Œ λ―Έμ„Έν•œ λ―Έλ„λŸΌμ΄ μžˆμŠ΅λ‹ˆλ‹€.',
  'orig_score5_description':'이 λͺ¨λΈμ€ κ°ˆλ“±μ„ ν›Œλ₯­ν•˜κ²Œ κ΄€λ¦¬ν•˜κ³ , μ§€μ†μ μœΌλ‘œ κΈ΄μž₯을 ν™•μ‚°μ‹œν‚€λ©°, λŒ€ν™”λ₯Ό νƒ€ν˜‘μœΌλ‘œ μ•ˆλ‚΄ν•˜κ³  긍정적인 λŒ€ν™” ν™˜κ²½μ„ μ‘°μ„±ν•œλ‹€.',
  'orig_feedback': '제곡된 응닡은 λ‹Ήλ©΄ν•œ 문제λ₯Ό μ‘°μ •ν•˜κ±°λ‚˜ ν•΄κ²°ν•˜λŠ” λŠ₯λ ₯을 보여주지 μ•ŠλŠ”λ‹€. λŒ€μ‹  νŒ€μ˜ 우렀λ₯Ό μ‚¬μ†Œν™”ν•˜κ³  잠재적인 결과에 λŒ€ν•œ κ³ λ € 없이 동전을 λ˜μ§€κ±°λ‚˜ λŒ€νšŒλ₯Ό κ°œμ΅œν•˜λŠ” 것과 같은 비건섀적 μ†”λ£¨μ…˜μ„ μ œμ•ˆν•œλ‹€. λ˜ν•œ 응닡은 상황이 잘λͺ»λ˜λ©΄ νŒ€ ꡬ성원듀이 μ„œλ‘œλ₯Ό λΉ„λ‚œν•΄μ•Ό ν•œλ‹€λŠ” 것을 μ•”μ‹œν•œλ‹€. κ°ˆλ“±μ„ λ”μš± μ•…ν™”μ‹œν‚¨λ‹€. 건섀적인 λŒ€ν™”λ₯Ό μž₯λ €ν•˜κ±°λ‚˜ 두 접근법 μ‚¬μ΄μ˜ 쀑간 지점을 μ°ΎλŠ” κ²ƒμ˜ μ€‘μš”μ„±μ„ μΈμ •ν•˜μ§€ μ•ŠλŠ”λ‹€. λ”°λΌμ„œ 전체 μ μˆ˜λŠ” 1이닀.',
  'orig_score': 1,
}

instruction = f"""###The instruction to evaluate: {sample['orig_instruction']}
  ###Response to evaluate: {sample['orig_response']}
  ###Reference Answer (Score 5): {sample['orig_reference_answer']}
  ###Score Rubrics: [{sample['orig_criteria']}]
  Score 1: {sample['orig_score1_description']}
  Score 2: {sample['orig_score2_description']}
  Score 3: {sample['orig_score3_description']}
  Score 4: {sample['orig_score4_description']}
  Score 5: {sample['orig_score5_description']}
  ###Feedback:"""

# for training
# output = f"""{sample['orig_feedback']}
#   [RESULT] {sample['orig_score']}"""
    
conversation = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": instruction},
            # {"role": "assistant", "content": output}
        ]

input_ids = tokenizer.apply_chat_template(
  conversation,
  tokenize=True,
  add_generation_prompt=True,
  return_tensors='pt'
).to("cuda")

output = model.generate(input_ids, max_new_tokens=512)
output_text = tokenizer.decode(output[0][len(input_ids[0]):], skip_special_tokens=True)
print(output_text)
```
The model can also run without a reference answer: it evaluates `orig_response` against `orig_instruction` alone. In that case, use the following template instead.
```python
instruction = f"""###The instruction to evaluate: {sample['orig_instruction']}
  ###Response to evaluate: {sample['orig_response']}
  ###Score Rubrics: [{sample['orig_criteria']}]
  Score 1: {sample['orig_score1_description']}
  Score 2: {sample['orig_score2_description']}
  Score 3: {sample['orig_score3_description']}
  Score 4: {sample['orig_score4_description']}
  Score 5: {sample['orig_score5_description']}
  ###Feedback:"""
```
Because the feedback was truncated during training, the generated feedback may itself occasionally be cut off.
```
# Result with orig_reference_answer 
# OUTPUT: 이 λŒ€μ‘μ€ κ°ˆλ“± 해결에 λŒ€ν•œ 이해가 뢀쑱함을 보여쀀닀. 동전을 λ˜μ§€κ±°λ‚˜ κ²½μŸμ„ μ œμ•ˆν•˜λŠ” 것과 같이 제곡된 μ œμ•ˆμ€ 문제의 λ³΅μž‘μ„±μ„ λ¬΄μ‹œν•˜κ³  νŒ€ λ‚΄μ˜ λ‹€μ–‘ν•œ 관점을 λ¬΄μ‹œν•œλ‹€. 응닡은 두 μ ‘κ·Όλ²•μ˜ 잠재적 κ°€μΉ˜λ₯Ό μΈμ •ν•˜μ§€ μ•ŠμœΌλ©° νŒ€ ꡬ성원 κ°„μ˜ 이해와 쑴쀑을 μ΄‰μ§„ν•˜μ§€λ„ μ•ŠλŠ”λ‹€. λ˜ν•œ 응닡은 νŒ€μ˜ μ—΄μ •κ³Ό ν”„λ‘œμ νŠΈμ— λŒ€ν•œ ν—Œμ‹ μ„ μΈμ •ν•˜μ§€ μ•ŠλŠ”λ‹€. λ”°λΌμ„œ 전체 μ μˆ˜λŠ” 1이닀.
    [RESULT] 1
# Result without orig_reference_answer 
# OUTPUT: λŒ€μ‘μ€ κ°ˆλ“± 해결에 λŒ€ν•œ 이해λ₯Ό λ‚˜νƒ€λ‚΄μ§€ μ•ŠλŠ”λ‹€. AI λͺ¨λΈμ€ κ°ˆλ“±μ„ ν•΄κ²°ν•˜κΈ°λ³΄λ‹€λŠ” κ°ˆλ“±μ„ μ•…ν™”μ‹œν‚€λŠ” 것을 μ œμ•ˆν•˜λ©°, μ΄λŠ” 점수 λ£¨λΈŒλ¦­μ— 따라 μš”κ΅¬ 사항에 μ–΄κΈ‹λ‚œλ‹€. 동전을 λ˜μ§€κ³  κ²½μŸμ„ μ œμ•ˆν•˜λŠ” 것은 νŒ€ ꡬ성원 κ°„μ˜ κΈ΄μž₯을 ν™•μ‚°μ‹œν‚€λŠ” 데 도움이 λ˜μ§€ μ•Šκ³  였히렀 더 λ§Žμ€ κ°ˆλ“±μ„ μ΄‰λ°œν•  수 μžˆλ‹€. λ˜ν•œ, νŒ€ ꡬ성원이 더 λ‚˜μ€ 아이디어λ₯Ό κ°–λŠ” 것이 μ•„λ‹ˆλΌ "더 λ‚˜μ€" 아이디어λ₯Ό κ°–λŠ”λ‹€λŠ” 것을 μ•”μ‹œν•˜λŠ” 것은 νŒ€ ꡬ성원 κ°„μ˜ 화합을 μ΄‰μ§„ν•˜μ§€ μ•ŠλŠ”λ‹€. λ”°λΌμ„œ 전체 μ μˆ˜λŠ” 1이닀.
    [RESULT] 1
```
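Before extracting a score, it can be useful to check whether a generation was cut off before the final `[RESULT] n` marker. The helper below is a hypothetical sketch, not part of the model's codebase:

```python
import re

def is_truncated(text: str) -> bool:
    """Return True if the generation lacks a trailing '[RESULT] n' marker."""
    # A complete generation ends with "[RESULT]" followed by an integer score.
    return re.search(r'\[RESULT\]\s*[0-5]\s*$', text.strip()) is None

print(is_truncated("Feedback: good. [RESULT] 4"))       # False
print(is_truncated("Feedback: the response fails to"))  # True
```

When this returns `True`, you might re-generate with a larger `max_new_tokens` rather than trust a partial evaluation.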
If you just want to get a score from the evaluation, you can use the following extract_score function.
```python
import re

def extract_score(text):
    """Extract the integer score after [RESULT]; return 0 if none is found."""
    match = re.search(r'\[RESULT\]\s+([0-5])', text)
    return int(match.group(1)) if match else 0

predict_score = extract_score(output_text)
print(predict_score) # 1
```

### **Heatmap Visualize**
[eng->eng] We randomly sampled 200 examples from the [training data](https://huggingface.co/datasets/prometheus-eval/Feedback-Collection), extracted scores from the model-generated outputs, and compared them to the gold scores. Since the training and test data are not separated here, this only shows how well the model fit the training distribution.

[ko->ko] We sampled 200 examples from this [test set](https://huggingface.co/datasets/nayohan/feedback-collection-ko-chat/viewer/default/test); llama3-8b-it-prometheus-ko was trained on the train split only.

- prometheus-7b-v1.0 (English train -> English inference) # 3 samples failed to output a score, 197 in total
- llama3-8b-it-prometheus-ko (Korean train -> Korean inference) # 200 in total

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6152b4b9ecf3ca6ab820e325/ssZRGTysyiOZD4ttNOD4s.png)
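The heatmap above can be reproduced along these lines: tally gold vs. predicted scores into a 5x5 confusion matrix and plot it. This is a minimal sketch with placeholder data, not the actual evaluation script:

```python
from collections import Counter

# Placeholder labels for illustration; in practice these come from the
# dataset's orig_score field and from extract_score() over model outputs.
gold      = [1, 2, 3, 4, 5, 5, 4, 1]
predicted = [1, 2, 3, 4, 5, 4, 4, 2]

counts = Counter(zip(gold, predicted))
# matrix[g-1][p-1] = number of samples with gold score g predicted as p
matrix = [[counts[(g, p)] for p in range(1, 6)] for g in range(1, 6)]
for row in matrix:
    print(row)
```

Feeding `matrix` into something like `seaborn.heatmap` would then give a plot comparable to the one shown.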

### **Citation**
```bibtex
@misc{kim2023prometheus,
    title={Prometheus: Inducing Fine-grained Evaluation Capability in Language Models},
    author={Seungone Kim and Jamin Shin and Yejin Cho and Joel Jang and Shayne Longpre and Hwaran Lee and Sangdoo Yun and Seongjin Shin and Sungdong Kim and James Thorne and Minjoon Seo},
    year={2023},
    eprint={2310.08491},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
Our training code can be found here: [TBD]