vatsal-metavoice committed on
Commit
14aac75
1 Parent(s): e007fe7

feat: update README

  1. README.md +3 -34
README.md CHANGED
@@ -2,6 +2,8 @@
  license: apache-2.0
  language:
  - en
+ tags:
+ - pretrained
  ---
 
  MetaVoice-1B is a 1.2B parameter base model trained on 100K hours of speech for TTS (text-to-speech). It has been built with the following priorities:
@@ -13,38 +15,8 @@ MetaVoice-1B is a 1.2B parameter base model trained on 100K hours of speech for
 
  We’re releasing MetaVoice-1B under the Apache 2.0 license, *it can be used without restrictions*.
 
- ## Installation
- ```bash
- # install ffmpeg
- wget https://johnvansickle.com/ffmpeg/builds/ffmpeg-git-amd64-static.tar.xz
- wget https://johnvansickle.com/ffmpeg/builds/ffmpeg-git-amd64-static.tar.xz.md5
- md5sum -c ffmpeg-git-amd64-static.tar.xz.md5
- tar xvf ffmpeg-git-amd64-static.tar.xz
- sudo mv ffmpeg-git-*-static/ffprobe ffmpeg-git-*-static/ffmpeg /usr/local/bin/
- rm -rf ffmpeg-git-*
-
- pip install -r requirements.txt
- pip install -e .
- ```
-
- ## Download
- ```
- wget https://cdn.themetavoice.xyz/metavoice-1B-v0.1.tar
- tar -xvf metavoice-1B-v0.1.tar
- ```
-
  ## Usage
- 1. [Download it](https://cdn.themetavoice.xyz/metavoice-1B-v0.1.tar) and use it anywhere (including locally) with our [reference implementation](/fam/llm/sample.py),
- ```bash
- python fam/llm/sample.py --model_dir=<PATH_TO_MODEL_DIR> --spk_cond_path=<PATH_TO_TARGET_AUDIO>
- ```
-
- 2. Deploy it on any cloud (AWS/GCP/Azure), using our [inference server](/fam/llm/serving.py)
- ```bash
- python fam/llm/serving.py --model_dir=<PATH_TO_MODEL_DIR>
- ```
-
- 3. Use it on HuggingFace
+ See [Github](https://github.com/metavoiceio/metavoice-src) for the latest usage instructions.
 
  ## Soon
  - Long form TTS
@@ -66,6 +38,3 @@ We predict EnCodec tokens from text, and speaker information. This is then diffu
  The model supports:
  1. KV-caching via Flash Decoding
  2. Batching (including texts of different lengths)
-
- ## Contribute
- - See all [active issues](https://github.com/themetavoicexyz/issues)!