Fredric commited on
Commit
dab5cce
β€’
0 Parent(s):

Initial commit

Browse files
Files changed (1) hide show
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Emotional TTS Comparison
2
+
3
+ This project explores ways to incorporate emotion into Text-to-Speech (TTS) using OpenAI's GPT-4 for text modification and TTS-1 for speech synthesis.
4
+
5
+ ## Background
6
+
7
+ While some TTS systems like Bark can include descriptive elements in speech (e.g., "(큰 μ†Œλ¦¬λ‘œ) μœ„ν—˜ν•΄μš”!"), they may have quality issues with noise. This project aims to find a method to convey emotion using OpenAI's TTS while maintaining high audio quality.
8
+
9
+ ## How It Works
10
+
11
+ 1. The user inputs a text.
12
+ 2. The system generates three versions of the text:
13
+ - Original: The input text as-is
14
+ - Emotional: A slightly more emotional version
15
+ - Exaggerated: A highly emotional, exaggerated version
16
+ 3. Each version is then converted to speech using OpenAI's TTS-1 model.
17
+
18
+ ## Example
19
+
20
+ Original: "μœ„ν—˜ν•΄μš”"
21
+ Emotional: "μœ„ν—˜ν•΄μš”!!"
22
+ Exaggerated: "μž κΉλ§Œμš”! μ•ˆλΌ, μœ„ν—˜ν•΄μš”!!"
23
+
24
+ ## Features
25
+
26
+ - Uses GPT-4o-mini for text modification
27
+ - Employs OpenAI's TTS-1 for high-quality speech synthesis
28
+ - Provides a Gradio interface for easy interaction
29
+ - Allows comparison of different emotional intensities in speech
30
+
31
+ ## Usage
32
+
33
+ 1. Enter your text in the input box.
34
+ 2. Click "Generate Versions and Speech".
35
+ 3. Listen to and compare the three versions of the speech.
36
+
37
+ ## Deployment
38
+
39
+ This project is deployed on Hugging Face Spaces, allowing easy access and usage without local setup.
40
+
41
+ ## Note
42
+
43
+ This approach aims to strike a balance between conveying emotion and maintaining speech quality. It demonstrates how text modification can influence the perceived emotion in TTS output.
44
+