Chan-Y committed on
Commit 945345c · 1 parent: 6fd2aa7

Update README.md

Files changed (1): README.md (+79 −65)
---
library_name: transformers
license: mit
base_model: microsoft/speecht5_tts
tags:
- generated_from_trainer
model-index:
- name: speecht5_tr_commonvoice_2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# speecht5_tr_commonvoice_2

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the CommonVoice Turkish Corpus 19.0 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5934

## How to use
```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import pipeline

# Load a speaker embedding (x-vector) that conditions the voice of the synthesizer
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

pipe = pipeline("text-to-audio", model="Chan-Y/speecht5_finetuned_tr_commonvoice")

text = "bugün okula erken geldim, çalışmam lazım. çok sıkıcı bir dersim var."
result = pipe(text, forward_params={"speaker_embeddings": speaker_embedding})

sf.write("speech.wav", result["audio"], samplerate=result["sampling_rate"])

# Play the result back in a notebook
from IPython.display import Audio
Audio("speech.wav")
```

## Training and evaluation data

I used the [CommonVoice Turkish Corpus 19.0](https://commonvoice.mozilla.org/tr/datasets).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7533        | 1.2972 | 1000 | 0.6445          |
| 0.6745        | 2.5945 | 2000 | 0.6106          |
| 0.6535        | 3.8917 | 3000 | 0.5953          |
| 0.6593        | 5.1889 | 4000 | 0.5934          |

### Framework versions

- Transformers 4.46.3
- PyTorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3