kyutai/moshika-candle-q8

jegou committed on Sep 18

Commit 66f4552 (1 parent: c4b6c85)

Update README.md

Files changed (1)

README.md +3 -0

README.md CHANGED

@@ -13,6 +13,9 @@ Moshi is a speech-text foundation model and full-duplex spoken dialogue framewor
 ## Model Details
+Candle version (Rust) quantized with 8-bit precision.
 ### Model Description
 Moshi is a speech-text foundation model that casts spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics.