Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/
- (ACL'24) OWSM-CTC: a CTC-based non-autoregressive speech foundation model for multilingual ASR, ST, and LID.
- OWSM-CTC v3.1 further fine-tuned on v3.2 data to improve long-form robustness.
- (INTERSPEECH'24) OWSM v3.1 medium with 1.02B parameters.
- (INTERSPEECH'24) OWSM v3.1 small with 367M parameters.
- (INTERSPEECH'24) OWSM v3.1 base with 101M parameters.
- (INTERSPEECH'24) OWSM v3.1 small trained on a data subset with low-restriction licenses.
- (INTERSPEECH'24) OWSM small with data cleaning.
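For reference, below is a minimal usage sketch with ESPnet's `Speech2Text` interface for the attention-based OWSM checkpoints. The Hugging Face model tag (`espnet/owsm_v3.1_ebf_small`), keyword arguments, and hypothesis format follow the public OWSM model cards and are assumptions that may differ across ESPnet releases.

```python
# Sketch: English ASR with an OWSM v3.1 checkpoint via ESPnet (assumed API/tag).
import soundfile as sf
from espnet2.bin.s2t_inference import Speech2Text

s2t = Speech2Text.from_pretrained(
    "espnet/owsm_v3.1_ebf_small",  # assumed Hugging Face model tag
    device="cpu",                  # or "cuda"
    beam_size=5,
    ctc_weight=0.0,
    maxlenratio=0.0,
    lang_sym="<eng>",              # target language token
    task_sym="<asr>",              # <asr> for transcription; ST uses a translation task token
)

speech, rate = sf.read("sample_16k.wav")  # 16 kHz mono audio expected
text, *_ = s2t(speech)[0]                 # best hypothesis; tuple layout may vary by version
print(text)
```

The CTC-based OWSM-CTC models use a separate non-autoregressive inference entry point in ESPnet rather than this beam-search interface.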