Access to this model's files is gated: the repository is publicly accessible, but you must accept the conditions before downloading. By accepting, you agree not to use the model to generate content that violates the DMCA or local laws.

Fish Speech V1.5

Fish Speech V1.5 is a leading text-to-speech (TTS) model trained on more than 1 million hours of audio data in multiple languages.

Supported languages:

  • English (en) >300k hours
  • Chinese (zh) >300k hours
  • Japanese (ja) >100k hours
  • German (de) ~20k hours
  • French (fr) ~20k hours
  • Spanish (es) ~20k hours
  • Korean (ko) ~20k hours
  • Arabic (ar) ~20k hours
  • Russian (ru) ~20k hours
  • Dutch (nl) <10k hours
  • Italian (it) <10k hours
  • Polish (pl) <10k hours
  • Portuguese (pt) <10k hours
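
For scripts that need to validate a language before requesting synthesis, the list above can be transcribed into a small lookup table. This is only a sketch: `SUPPORTED_LANGUAGES`, `is_supported`, and `describe` are hypothetical names chosen for illustration and are not part of any Fish Speech API. The hour figures are copied verbatim from the list as approximate strings.

```python
# Hypothetical lookup table transcribed from the supported-language list.
# Not part of Fish Speech itself; the names here are illustrative assumptions.
# Each value is (language name, approximate training hours as stated in the card).
SUPPORTED_LANGUAGES = {
    "en": ("English", ">300k"),
    "zh": ("Chinese", ">300k"),
    "ja": ("Japanese", ">100k"),
    "de": ("German", "~20k"),
    "fr": ("French", "~20k"),
    "es": ("Spanish", "~20k"),
    "ko": ("Korean", "~20k"),
    "ar": ("Arabic", "~20k"),
    "ru": ("Russian", "~20k"),
    "nl": ("Dutch", "<10k"),
    "it": ("Italian", "<10k"),
    "pl": ("Polish", "<10k"),
    "pt": ("Portuguese", "<10k"),
}


def is_supported(code: str) -> bool:
    """Return True if the two-letter code appears in the list above."""
    return code.strip().lower() in SUPPORTED_LANGUAGES


def describe(code: str) -> str:
    """Return a one-line summary for a listed language, or raise ValueError."""
    key = code.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Language {code!r} is not listed as supported")
    name, hours = SUPPORTED_LANGUAGES[key]
    return f"{name} ({key}): {hours} hours"
```

A caller could check `is_supported("ja")` before sending Japanese text, and use `describe("de")` to log how much training data backs a given language.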

Please refer to the Fish Speech GitHub repository for more information.
A demo is available at Fish Audio.

Citation

If you find this repository useful, please consider citing this work:

@misc{fish-speech-v1.4,
      title={Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis}, 
      author={Shijia Liao and Yuxuan Wang and Tianyu Li and Yifan Cheng and Ruoyi Zhang and Rongzhi Zhou and Yijin Xing},
      year={2024},
      eprint={2411.01156},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2411.01156}, 
}

License

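This model is released under the CC BY-NC-SA 4.0 license (Creative Commons Attribution-NonCommercial-ShareAlike 4.0). The license allows non-commercial use only and requires derivative works to be shared under the same terms.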
