DeepLangLvcc committed
Commit 5646c50 · 1 Parent(s): 4e079f6
update readme

Files changed: README.md (+5 -4), README_EN.md (+5 -4)
README.md CHANGED

@@ -70,12 +70,13 @@ LingoWhale-8B模型对学术研究完全开放，使用方通过邮件申请并
 | **GPT-4** | 68.4 | 83.9 | 70.3 | 66.2 | 69.5 | 90.0 | 75.1 | 63.3 |
 | **GPT-3.5 Turbo** | 51.1 | 68.5 | 54.1 | 47.1 | 52.4 | 57.8 | 61.6 | 46.1 |
 | **LLaMA2-7B** | 28.9 | 45.7 | 31.4 | 26.0 | 12.8 | 16.2 | 39.2 | 26.5 |
-| **ChatGLM2-6B
-| **Baichuan2-7B-Base
-| **Qwen-7B v1.1
+| **ChatGLM2-6B**$\ast$ | 51.7 | 47.9 | - | - | - | 32.4 | 33.7 | - |
+| **Baichuan2-7B-Base**$\ast$ | 54.0 | 54.2 | 57.1 | 47.5 | 18.3 | 24.5 | 41.6 | 42.7 |
+| **Qwen-7B v1.1**$\ast$ | 63.5 | 58.2 | 62.2 | - | 29.9 | 51.7 | 45.0 | - |
 | **LingoWhale-8B-base** | 63.6 | 60.2 | 62.8 | 50.3 | 32.9 | 55.0 | 47.5 | 43.8 |
 
-
+
+$\textcolor{gray}\ast$<span style="color:gray">表示其模型结果来自于官方, 所有的结果都精确到小数点后1位。 </span>
 
 # 生成样例
 
README_EN.md CHANGED

@@ -68,12 +68,13 @@ These evaluation benchmarks provide standardized tests and metrics to assess lan
 | **GPT-4** | 68.4 | 83.9 | 70.3 | 66.2 | 69.5 | 90.0 | 75.1 | 63.3 |
 | **GPT-3.5 Turbo** | 51.1 | 68.5 | 54.1 | 47.1 | 52.4 | 57.8 | 61.6 | 46.1 |
 | **LLaMA2-7B** | 28.9 | 45.7 | 31.4 | 26.0 | 12.8 | 16.2 | 39.2 | 26.5 |
-| **ChatGLM2-6B
-| **Baichuan2-7B-Base
-| **Qwen-7B v1.1
+| **ChatGLM2-6B**$\ast$ | 51.7 | 47.9 | - | - | - | 32.4 | 33.7 | - |
+| **Baichuan2-7B-Base**$\ast$ | 54.0 | 54.2 | 57.1 | 47.5 | 18.3 | 24.5 | 41.6 | 42.7 |
+| **Qwen-7B v1.1**$\ast$ | 63.5 | 58.2 | 62.2 | - | 29.9 | 51.7 | 45.0 | - |
 | **LingoWhale-8B-base** | 63.6 | 60.2 | 62.8 | 50.3 | 32.9 | 55.0 | 47.5 | 43.8 |
 
-
+
+$\textcolor{gray}\ast$<span style="color:gray">indicates that the model results are from the official, and all the results are accurate to 1 decimal place. </span>
 
 # Generated Examples
 