[
    {
        "id": "rhythm_audio_0",
        "input_path": "/input/rhythm/audio_0.mp3",
        "text": "Say the following sentence very slowly: \"The quick brown fox jumps over the lazy dog.\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_0/audio_0.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/00.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_0.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_0.wav",
        "text_cn": "非常缓慢地说以下句子:'快速的棕狐跳过了懒狗。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_0.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_0.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_audio_3",
        "input_path": "/input/rhythm/audio_3.mp3",
        "text": "Say the first half of this sentence quickly and the second half slowly: \"In a world full of challenges, we need to find creative solutions to make things better.\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_3/audio_3.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/01.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_3.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_1.wav",
        "text_cn": "在这句话的上半年中迅速和下半部分说:“在一个充满挑战的世界中,我们需要找到创造性的解决方案来使事情变得更好。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_3.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_3.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_audio_7",
        "input_path": "/input/rhythm/audio_7.mp3",
        "text": "First, say this sentence as fast as you can, then repeat it very slowly: \"The rain in Spain falls mainly in the plain.\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_7/audio_7.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/02.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_7.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_2.wav",
        "text_cn": "首先,尽可能快地说这句话,然后非常缓慢地重复:'西班牙的降雨主要落在平原上。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_7.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_7.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_audio_18",
        "input_path": "/input/rhythm/audio_18.mp3",
        "text": "Repeat the phrase 'Practice makes perfect' three times, each time at a different speed: first very slowly, then moderately, and finally very fast.",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_18/audio_18.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/03.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_18.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_3.wav",
        "text_cn": "每次以不同的速度重复三次“练习完美”一词:首先非常缓慢,然后适度,最后非常快。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_18.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_18.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_audio_22",
        "input_path": "/input/rhythm/audio_22.mp3",
        "text": "Say the following sentence as if you're whispering: \"This is a secret, you can't tell anyone.\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_22/audio_22.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/04.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_22.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_4.wav",
        "text_cn": "说以下句子,好像您在窃窃私语:“这是一个秘密,您不能告诉任何人。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_22.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_22.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_audio_23",
        "input_path": "/input/rhythm/audio_23.mp3",
        "text": "Repeat this sentence very loudly as if you are trying to get someone's attention: \"Watch out! There's something coming your way!\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/audio_23/audio_23.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/05.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/answer_23.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/audio_5.wav",
        "text_cn": "大声重复这句话,好像您想引起某人的注意:“当心!有一些事情!",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/audio_23.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/audio_23.wav",
        "level": "L2"
    },
    {
        "id": "rhythm_input_audio_3",
        "input_path": "/input/rhythm/input_audio_3.wav",
        "text": "Say the following sentence at my speed first, then say it again very slowly: \"Artificial intelligence is changing the world in many ways.\"",
        "task": "Rhythm control capabilities",
        "task_description": "Can the model adjust the output pace, speaking faster or slower as required?",
        "output_path_4o": "/output/ChatGPT-4o/rhythm/4o_audio_3.wav",
        "output_path_miniomni": "/output/Mini-Omni/rhythm/mini-omni_03.wav",
        "output_path_speechgpt": "/output/SpeechGPT/rhythm/SpeechGPT_answer_3.wav",
        "output_path_funaudio": "/output/FunAudioLLM/rhythm/FunAudio_audio_3.1.wav",
        "text_cn": "先按我的速度说出下面这句话,然后再慢慢地说一遍:“人工智能正在以多种方式改变世界。",
        "language": "English",
        "category": "Education",
        "output_path_4o_cascade": "/output/cascade/rhythm/input_audio_3.wav",
        "output_path_4o_llama_omni": "/output/LLaMA_omni/rhythm/input_audio_3.wav",
        "level": "L3"
    }
]