MUSTAR/Rigel-rvc-base-pretrained-model

Rigel Pretrained Model

Base and fine-tuned models

Dataset

Size: 1,921 hours of speech and vocals in total.
Languages:
- Arabic: ~70 hours
- Chinese (Mandarin): ~70 hours
- English: ~800 hours
- French: ~42 hours
- German: ~35 hours
- Hindi: ~30 hours
- Indonesian: ~53 hours
- Japanese: ~140 hours
- Korean: ~80 hours
- Portuguese: ~40 hours
- Russian: ~188 hours
- Singing (all languages): ~190 hours
- Spanish: ~200 hours
- Tagalog: ~30 hours
- Common language: Unknown amount
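
As a quick sanity check on the breakdown above, the listed per-language figures can be summed and compared with the stated total. The sketch below copies the approximate hours from the list; the names `hours`, `STATED_TOTAL`, and `shares` are illustrative only, and the "Common language" entry is left out because its size is unknown. Since every figure is approximate, the sum (about 1,968 hours) need not match the stated 1,921 hours exactly.

```python
# Approximate hours per category, copied from the list above.
# "Common language" is omitted because its amount is unknown.
hours = {
    "Arabic": 70,
    "Chinese (Mandarin)": 70,
    "English": 800,
    "French": 42,
    "German": 35,
    "Hindi": 30,
    "Indonesian": 53,
    "Japanese": 140,
    "Korean": 80,
    "Portuguese": 40,
    "Russian": 188,
    "Singing (all languages)": 190,
    "Spanish": 200,
    "Tagalog": 30,
}

STATED_TOTAL = 1921  # total hours stated on the card

# Sum of the listed (approximate) figures and its gap to the stated total.
listed_total = sum(hours.values())
difference = listed_total - STATED_TOTAL

# Share of each category relative to the listed sum.
shares = {name: h / listed_total for name, h in hours.items()}

print(f"Listed total: ~{listed_total} h "
      f"(stated: {STATED_TOTAL} h, difference: {difference:+d} h)")
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:24s} {hours[name]:4d} h  {share:6.1%}")
```

The shares make the imbalance explicit: English alone accounts for roughly 40% of the listed hours.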

Sampling Frequency

32 kHz (done)
40 kHz (retraining)

Models

Base Model

Data: 1,921 hours of low-to-mid quality data in total.
Steps: 3,890,220
Batch: 40
Precision: FP32
Sampling Rate: 32 kHz

Fine-Tuned Model

Data: 102 hours of high-quality data.
Steps: 2,854,856
Batch: 20
Precision: FP32
Sampling Rate: 32 kHz
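
To put the two step counts in perspective, the number of training examples each run processed can be estimated as steps times batch size. This is a rough sketch under one assumption: that the batch value is the total number of segments per optimizer step. The card does not say whether it is per GPU or global, and the `TrainingRun` class and its field names are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    """Figures copied from the card; the class itself is hypothetical."""
    name: str
    data_hours: int
    steps: int
    batch_size: int  # ASSUMPTION: total batch per step, not per GPU

    @property
    def examples_seen(self) -> int:
        # Number of training segments processed over the whole run.
        return self.steps * self.batch_size

    @property
    def steps_per_data_hour(self) -> float:
        # How many optimizer steps were taken per hour of training audio.
        return self.steps / self.data_hours


base = TrainingRun("Base Model", data_hours=1921, steps=3_890_220, batch_size=40)
fine = TrainingRun("Fine-Tuned Model", data_hours=102, steps=2_854_856, batch_size=20)

for run in (base, fine):
    print(f"{run.name}: {run.examples_seen:,} examples seen, "
          f"{run.steps_per_data_hour:,.0f} steps per hour of data")
```

Under this assumption, the fine-tuning run takes roughly 14 times as many steps per hour of data as the base run, which is consistent with a much smaller dataset being revisited many more times.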

Hardware Used

CPU: AMD EPYC 9754
RAM: 256 GB
GPUs:
- 1 x H100
- 4 x L40S
- 1 x RTX 4080
- 1 x RTX 4070 Ti

Expected Release Date

July 22nd
