---
license: apache-2.0
metrics:
- accuracy
pipeline_tag: audio-classification
---

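## Voice Detection

This model targets sounds that occur in phone-call scenarios, such as:

| sound | description |
| :------- | :------- |
| bell | Ringing |
| music | Music |
| mute | Silence (no sound at all) |
| noise | Noise (relatively loud) |
| noise_mute | Ambient sound (also noise, but quieter) |
| voice | Speech (the user speaking; far-field speech is treated as ambient sound) |
| voicemail | Voicemail (a voicemail announcement played by the carrier) |
| white_noise | White noise (a humming sound, usually caused by the phone line) |
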
This model classifies the sounds above into two labels, "non_voice" and "voice", as follows:

| sound | label |
| :------- | :------- |
| bell | non_voice |
| music | non_voice |
| mute | non_voice |
| noise | non_voice |
| noise_mute | non_voice |
| voice | voice |
| voicemail | voice |
| white_noise | voice |

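The sound-to-label mapping can be written down as a small lookup table, which may be handy when collapsing fine-grained annotations into the two output labels. This is only a sketch: the names `SOUND_TO_LABEL` and `to_binary_label` are hypothetical and not part of the model's code, and the entries are copied as-is from the mapping table.

```python
# Hypothetical helper: maps each fine-grained call sound to the binary label
# listed in the mapping table of this model card.
SOUND_TO_LABEL = {
    "bell": "non_voice",
    "music": "non_voice",
    "mute": "non_voice",
    "noise": "non_voice",
    "noise_mute": "non_voice",
    "voice": "voice",
    "voicemail": "voice",
    "white_noise": "voice",
}


def to_binary_label(sound: str) -> str:
    """Return "voice" or "non_voice" for a fine-grained sound category."""
    try:
        return SOUND_TO_LABEL[sound]
    except KeyError:
        # Fail loudly on categories that the table does not cover.
        raise ValueError(f"unknown sound category: {sound!r}") from None
```

For example, `to_binary_label("voicemail")` returns `"voice"`, while an unlisted category raises `ValueError` instead of being silently assigned a label.
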
Accuracy:

| label | accuracy |
| :------- | :------- |
| non_voice | 95.27% |
| voice | 95.48% |
| total | 95.35% |

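As a rough illustration of how a per-label accuracy table like this one could be computed, the sketch below assumes that "accuracy" for a label means the fraction of clips with that true label that were predicted correctly, and that "total" is the fraction of all clips predicted correctly. The card does not state how its numbers were obtained, so this is an assumption; the function name `per_label_accuracy` and the toy labels are made up for the example.

```python
from collections import Counter


def per_label_accuracy(y_true, y_pred):
    """Return {label: accuracy} plus a "total" entry over all samples."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    # Number of samples per true label.
    totals = Counter(y_true)
    # Number of correctly predicted samples per true label.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {label: correct[label] / n for label, n in totals.items()}
    report["total"] = sum(correct.values()) / len(y_true)
    return report


# Toy data for illustration only; these are not the model's evaluation results.
y_true = ["voice", "voice", "non_voice", "non_voice", "non_voice"]
y_pred = ["voice", "non_voice", "non_voice", "non_voice", "non_voice"]
report = per_label_accuracy(y_true, y_pred)
```

Under this definition the total is a sample-weighted average of the per-label values, so it always lies between them.
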