# Useful Tools

## Prompt Viewer

This tool lets you view the generated prompts directly, without launching the full evaluation process. If the configuration passed in contains only a dataset configuration (such as `configs/datasets/nq/nq_gen.py`), it displays the original prompt defined in that dataset configuration. If it is a complete evaluation configuration (including both models and datasets), it displays the prompt that the selected model actually receives at runtime.

Running method:

```bash
python tools/prompt_viewer.py CONFIG_PATH [-n] [-a] [-p PATTERN]
```

- `-n`: Non-interactive mode; the first model (if any) and the first dataset in the configuration are selected by default.
- `-a`: View the prompts received by every model and dataset combination in the configuration.
- `-p PATTERN`: Non-interactive mode; all datasets whose names match the given regular expression are selected.
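
The dataset names, pattern, and exact matching semantics below are assumptions for illustration only; the real selection logic lives inside `tools/prompt_viewer.py`. The idea of `-p` is roughly:

```python
import re

# Hypothetical dataset names as they might appear in a configuration.
datasets = ["mmlu_gen", "mmlu_ppl", "cmmlu_gen", "nq_gen"]

# With `-p mmlu`, every dataset whose name matches the regular
# expression is selected (unanchored matching is an assumption here).
pattern = re.compile("mmlu")
selected = [name for name in datasets if pattern.search(name)]
print(selected)  # ['mmlu_gen', 'mmlu_ppl', 'cmmlu_gen']
```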

## Case Analyzer (To be updated)

Based on existing evaluation results, this tool extracts samples with inference errors, as well as full samples annotated with additional information.

Running method:

```bash
python tools/case_analyzer.py CONFIG_PATH [-w WORK_DIR]
```

- `-w`: Work path, default is `'./outputs/default'`.

## Lark Bot

Users can configure a Lark bot to receive real-time reports on task status. Please refer to [this document](https://open.feishu.cn/document/ukTMukTMukTM/ucTM5YjL3ETO24yNxkjN?lang=zh-CN#7a28964d) to set up the Lark bot.

Configuration method:

- Open the `configs/secrets.py` file and add the following line:

```python
lark_bot_url = 'YOUR_WEBHOOK_URL'
```

- Normally, the webhook URL looks like `https://open.feishu.cn/open-apis/bot/v2/hook/xxxxxxxxxxxxxxxxx`.

- Inherit this file in your complete evaluation configuration.

- To avoid the bot sending messages too frequently and becoming a disturbance, running status is not reported automatically by default. If needed, enable status reporting with `-l` or `--lark`:

```bash
python run.py configs/eval_demo.py -l
```
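
For the inheritance step, an evaluation configuration might pull in the secrets file via mmengine's `read_base` (a sketch; the relative import assumes your configuration file sits next to `configs/secrets.py`):

```python
from mmengine.config import read_base

with read_base():
    # Brings `lark_bot_url` from configs/secrets.py into this config.
    from .secrets import *
```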

## API Model Tester

This tool provides a quick check of whether an API model is working correctly.

Running method:

```bash
python tools/test_api_model.py [CONFIG_PATH] -n
```

## Prediction Merger

This tool can merge partitioned predictions.

Running method:

```bash
python tools/prediction_merger.py CONFIG_PATH [-w WORK_DIR]
```

- `-w`: Work path, default is `'./outputs/default'`.
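
As an illustration of what merging partitioned predictions involves (the file layout below is an assumption, not OpenCompass's actual format), suppose each partition is a JSON file mapping example indices to predictions:

```python
import json
from pathlib import Path

def merge_predictions(part_files):
    """Merge per-partition JSON prediction files into one dict (illustrative)."""
    merged = {}
    for path in sorted(part_files, key=str):
        merged.update(json.loads(Path(path).read_text()))
    return merged
```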

## List Configs

This tool can list or search all available model and dataset configurations. It supports fuzzy search, making it convenient for use in conjunction with `run.py`.

Usage:

```bash
python tools/list_configs.py [PATTERN1] [PATTERN2] [...]
```

If executed without any arguments, it lists all model and dataset configurations in the `configs/models` and `configs/datasets` directories by default.

Users can also pass any number of arguments. The script will list all configurations related to the provided strings, supporting fuzzy search and the `*` wildcard. For example, the following command lists all configurations related to `mmlu` and `llama`:

```bash
python tools/list_configs.py mmlu llama
```

Its output could be:

```text
+-----------------+-----------------------------------+
| Model           | Config Path                       |
|-----------------+-----------------------------------|
| hf_llama2_13b   | configs/models/hf_llama2_13b.py   |
| hf_llama2_70b   | configs/models/hf_llama2_70b.py   |
| hf_llama2_7b    | configs/models/hf_llama2_7b.py    |
| hf_llama_13b    | configs/models/hf_llama_13b.py    |
| hf_llama_30b    | configs/models/hf_llama_30b.py    |
| hf_llama_65b    | configs/models/hf_llama_65b.py    |
| hf_llama_7b     | configs/models/hf_llama_7b.py     |
| llama2_13b_chat | configs/models/llama2_13b_chat.py |
| llama2_70b_chat | configs/models/llama2_70b_chat.py |
| llama2_7b_chat  | configs/models/llama2_7b_chat.py  |
+-----------------+-----------------------------------+
+-------------------+---------------------------------------------------+
| Dataset           | Config Path                                       |
|-------------------+---------------------------------------------------|
| cmmlu_gen         | configs/datasets/cmmlu/cmmlu_gen.py               |
| cmmlu_gen_ffe7c0  | configs/datasets/cmmlu/cmmlu_gen_ffe7c0.py        |
| cmmlu_ppl         | configs/datasets/cmmlu/cmmlu_ppl.py               |
| cmmlu_ppl_fd1f2f  | configs/datasets/cmmlu/cmmlu_ppl_fd1f2f.py        |
| mmlu_gen          | configs/datasets/mmlu/mmlu_gen.py                 |
| mmlu_gen_23a9a9   | configs/datasets/mmlu/mmlu_gen_23a9a9.py          |
| mmlu_gen_5d1409   | configs/datasets/mmlu/mmlu_gen_5d1409.py          |
| mmlu_gen_79e572   | configs/datasets/mmlu/mmlu_gen_79e572.py          |
| mmlu_gen_a484b3   | configs/datasets/mmlu/mmlu_gen_a484b3.py          |
| mmlu_ppl          | configs/datasets/mmlu/mmlu_ppl.py                 |
| mmlu_ppl_ac766d   | configs/datasets/mmlu/mmlu_ppl_ac766d.py          |
+-------------------+---------------------------------------------------+
```
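
The fuzzy matching described above might behave roughly as in this sketch (the substring-vs-glob semantics are an assumption for illustration, not the tool's actual implementation):

```python
import fnmatch

def matches(name: str, pattern: str) -> bool:
    # Assumed semantics: plain patterns match as substrings;
    # patterns containing '*' are treated as glob-style wildcards.
    if "*" in pattern:
        return fnmatch.fnmatch(name, pattern)
    return pattern in name

names = ["mmlu_gen", "cmmlu_gen", "hf_llama2_7b"]
print([n for n in names if matches(n, "mmlu")])   # substring: also hits cmmlu_gen
print([n for n in names if matches(n, "mmlu*")])  # glob: anchored at the start
```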

## Dataset Suffix Updater

This tool can quickly update the suffixes of dataset configuration files under the `configs/datasets` directory, aligning them with the prompt-hash-based naming convention.

How to run:

```bash
python tools/update_dataset_suffix.py
```
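
The exact hashing scheme is internal to OpenCompass, but the idea behind a prompt-hash suffix like `mmlu_gen_23a9a9` can be sketched as follows (MD5 and the 6-character prefix are assumptions for illustration):

```python
import hashlib

def prompt_hash_suffix(prompt: str) -> str:
    # Illustrative: first 6 hex characters of the prompt's MD5 digest.
    return hashlib.md5(prompt.encode("utf-8")).hexdigest()[:6]

print(prompt_hash_suffix("Answer the question: {question}"))
```

Because the suffix is derived from the prompt text, two dataset configurations share a suffix only if their prompts are identical.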