Defts-lab committed on
Commit 2ddbe8e • 1 Parent(s): df30ead

Update README.md

Files changed (1)
  1. README.md +61 -4
README.md CHANGED
@@ -1,8 +1,8 @@
  ---
- title: PDFtoPodcast
  emoji: 📚
- colorFrom: blue
- colorTo: gray
  sdk: gradio
  sdk_version: 4.44.0
  app_file: app.py
@@ -10,4 +10,61 @@ pinned: false
  license: apache-2.0
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
  ---
+ title: Pdf2audio
  emoji: 📚
+ colorFrom: yellow
+ colorTo: pink
  sdk: gradio
  sdk_version: 4.44.0
  app_file: app.py
 
  license: apache-2.0
  ---
 
+ # PDF to Audio Converter
+
+ This Gradio app converts PDFs into audio podcasts, lectures, summaries, and more. It uses OpenAI's GPT models for text generation and text-to-speech conversion.
+
+ ## Features
+
+ - Upload multiple PDF files
+ - Choose from different instruction templates (podcast, lecture, summary, etc.)
+ - Customize text generation and audio models
+ - Select different voices for speakers
+
+ ## How to Use
+
+ 1. Upload one or more PDF files
+ 2. Select the desired instruction template
+ 3. Customize the instructions if needed
+ 4. Click "Generate Audio" to create your audio content
+
+ ## Use in Colab
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lamm-mit/PDF2Audio/blob/main/PDF2Audio.ipynb)
+
+ ## Audio Example
+
+ <audio controls>
+ <source src="https://raw.githubusercontent.com/lamm-mit/PDF2Audio/main/SciAgents%20discovery%20summary%20-%20example.mp3" type="audio/mpeg">
+ Your browser does not support the audio element.
+ </audio>
+
+ ## Note
+
+ This app requires an OpenAI API key to function.
+
+ ## Credits
+
+ This project was inspired by and based on the code available at [https://github.com/knowsuchagency/pdf-to-podcast](https://github.com/knowsuchagency/pdf-to-podcast) and [https://github.com/knowsuchagency/promptic](https://github.com/knowsuchagency/promptic).
+
+ GitHub repo: [lamm-mit/PDF2Audio](https://github.com/lamm-mit/PDF2Audio)
+
+ ```bibtex
+ @article{ghafarollahi2024sciagentsautomatingscientificdiscovery,
+ title={SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning},
+ author={Alireza Ghafarollahi and Markus J. Buehler},
+ year={2024},
+ eprint={2409.05556},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2409.05556},
+ }
+ @article{buehler2024graphreasoning,
+ title={Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning},
+ author={Markus J. Buehler},
+ journal={Machine Learning: Science and Technology},
+ year={2024},
+ url={http://iopscience.iop.org/article/10.1088/2632-2153/ad7228},
+ }
+ ```
+
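The metadata block touched by this commit (`title`, `colorFrom`, `colorTo`) is the front matter at the top of README.md. As a rough way to check the values after such an edit, the sketch below reads that block with the standard library only. It is not part of the commit: `parse_front_matter` is a hypothetical helper, it assumes flat `key: value` lines rather than full YAML, and the sample text is assembled from the lines shown in the diff (with `pinned: false` taken from the hunk header).

```python
def parse_front_matter(readme_text):
    """Return the flat key/value pairs between the leading '---' fences.

    Hypothetical helper for illustration; it handles only simple
    'key: value' lines, not nested or multi-line YAML.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing fence reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat as missing front matter


# Sample assembled from the "+" and context lines of the diff above.
README = """---
title: Pdf2audio
emoji: 📚
colorFrom: yellow
colorTo: pink
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
---

# PDF to Audio Converter
"""

meta = parse_front_matter(README)
print(meta["title"], meta["colorFrom"], meta["colorTo"])
```

All values come back as strings (for example `sdk_version` is `"4.44.0"`), so a real check would need a YAML parser if typed values matter.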