Anatoly Potapov committed
Commit a7f8ed0 • 1 Parent(s): 0b3dabc
Cite Vikhrmodels
README.md
CHANGED
@@ -73,7 +73,7 @@ This benchmark was carefully translated into Russian and measured with [LLM Judg
 
 ### 🏟️ [Arena](https://github.com/lm-sys/arena-hard-auto)
 
-We used Russian version of Arena benchmark and [Arena Hard Auto](https://github.com/lm-sys/arena-hard-auto) codebase
+We used Russian version of Arena benchmark from [Vikhrmodels](https://huggingface.co/datasets/Vikhrmodels/ru-arena-general) and [Arena Hard Auto](https://github.com/lm-sys/arena-hard-auto) codebase
 for evaluation. As baseline model we chose gpt3.5-turbo-0125 and the judge was gpt-4-1106-preview.
 
 <style>