metadata
license: apache-2.0
metrics:
  - accuracy
pipeline_tag: audio-classification

Voice Detection

This model targets the sounds that occur in phone-call scenarios, such as:

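| sound | description |
| --- | --- |
| bell | Ringing tone |
| music | Music |
| mute | Silence (no sound at all) |
| noise | Noise (relatively loud noise) |
| noise_mute | Ambient sound (also noise, but quieter) |
| voice | Speech (the user speaking; far-field speech is treated as ambient sound) |
| voicemail | Voicemail (the voicemail announcement played by the carrier) |
| white_noise | White noise (usually caused by the phone line; a humming sound) |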

This model groups the sounds above into two labels, "non_voice" and "voice", as follows:

| sound | label |
| --- | --- |
| bell | non_voice |
| music | non_voice |
| mute | non_voice |
| noise | non_voice |
| noise_mute | non_voice |
| voice | voice |
| voicemail | voice |
| white_noise | voice |
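
The sound-to-label mapping can be written as a small lookup table. The following is a minimal sketch, not code from this repository: the names `SOUND_TO_LABEL` and `to_binary_label` are hypothetical, and the entries are copied as-is from the table above.

```python
# Hypothetical helper: maps the eight fine-grained sound categories
# to the two labels listed in the table above.
SOUND_TO_LABEL = {
    "bell": "non_voice",
    "music": "non_voice",
    "mute": "non_voice",
    "noise": "non_voice",
    "noise_mute": "non_voice",
    "voice": "voice",
    "voicemail": "voice",
    "white_noise": "voice",
}


def to_binary_label(sound: str) -> str:
    """Return "voice" or "non_voice" for one of the eight sound categories."""
    try:
        return SOUND_TO_LABEL[sound]
    except KeyError:
        # Reject anything that is not one of the documented categories.
        raise ValueError(f"unknown sound category: {sound!r}") from None
```

For example, `to_binary_label("voicemail")` returns `"voice"`, while an unlisted category raises `ValueError` rather than silently defaulting to one of the two labels.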

Accuracy:

| label | accuracy |
| --- | --- |
| non_voice | 95.27% |
| voice | 95.48% |
| total | 95.35% |