esb (End-to-End Speech Benchmark)

The End-to-End Speech Benchmark (ESB) is a benchmark for assessing automatic speech recognition (ASR) systems on a collection of eight speech recognition datasets. ESB consists of:

    🤗 Datasets

    📜 Official Checkpoints

    🏆 Leaderboard

The ESB datasets are sourced from 11 different domains and cover a range of audio and text distributions (speaking styles, background noise, transcription requirements). There is no restriction on architecture or training data: any system capable of processing audio inputs and generating the corresponding transcriptions is eligible to participate. The only constraint is that the same training and evaluation algorithms must be used across all datasets, and systems may not use any dataset-specific pre- or post-processing. The objective of ESB is to encourage research into more generalisable, multi-domain ASR systems.
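The "same evaluation algorithm, no dataset-specific post-processing" constraint can be illustrated with a minimal sketch. It assumes word error rate (WER) as the metric and a simple macro-average across datasets, neither of which is specified in the text above. The names `normalize`, `wer`, and `evaluate` are hypothetical helpers, not part of any ESB tooling, and the normaliser is deliberately simplistic.

```python
import re


def normalize(text: str) -> str:
    """Single dataset-agnostic normaliser (an assumption for this sketch):
    lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return " ".join(text.split())


def edit_distance(ref: list, hyp: list) -> int:
    """Word-level Levenshtein distance using a rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses) -> float:
    """Corpus-level WER: total word errors divided by total reference words."""
    errors = words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = normalize(ref).split()
        hyp_words = normalize(hyp).split()
        errors += edit_distance(ref_words, hyp_words)
        words += len(ref_words)
    return errors / max(words, 1)  # guard against an empty reference set


def evaluate(system, datasets) -> dict:
    """Score one system on every dataset with the identical pipeline.

    `system` is any callable mapping audio to a transcription;
    `datasets` maps a dataset name to a list of (audio, reference) pairs.
    """
    scores = {}
    for name, examples in datasets.items():
        refs = [text for _, text in examples]
        hyps = [system(audio) for audio, _ in examples]
        scores[name] = wer(refs, hyps)  # no per-dataset branching
    scores["average"] = sum(scores.values()) / len(scores)
    return scores
```

The point of the sketch is that `evaluate` never branches on the dataset name: the same normaliser and scoring function are applied everywhere, so any gain on one dataset cannot come from rules tailored to it.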

ESB was proposed in "ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition". For more information, see the official paper on arXiv.

spaces 1

leaderboard

models 42

esb/conformer-rnnt-chime4

esb/conformer-rnnt-switchboard

esb/conformer-rnnt-ami

esb/conformer-rnnt-earnings22

esb/conformer-rnnt-spgispeech

esb/conformer-rnnt-gigaspeech

esb/conformer-rnnt-voxpopuli

esb/conformer-rnnt-tedlium

esb/conformer-rnnt-common_voice

esb/conformer-rnnt-librispeech

datasets 2

esb/datasets

esb/diagnostic-dataset
