Update README.md
README.md CHANGED
@@ -1,8 +1,8 @@
 ---
-title:
+title: Pdf2audio
 emoji: π
-colorFrom:
-colorTo:
+colorFrom: yellow
+colorTo: pink
 sdk: gradio
 sdk_version: 4.44.0
 app_file: app.py
@@ -10,4 +10,61 @@ pinned: false
 license: apache-2.0
 ---
 
-
+# PDF to Audio Converter
+
+This Gradio app converts PDFs into audio podcasts, lectures, summaries, and more. It uses OpenAI's GPT models for text generation and text-to-speech conversion.
+
+## Features
+
+- Upload multiple PDF files
+- Choose from different instruction templates (podcast, lecture, summary, etc.)
+- Customize text generation and audio models
+- Select different voices for speakers
+
+## How to Use
+
+1. Upload one or more PDF files
+2. Select the desired instruction template
+3. Customize the instructions if needed
+4. Click "Generate Audio" to create your audio content
+
+## Use in Colab
+
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lamm-mit/PDF2Audio/blob/main/PDF2Audio.ipynb)
+
+## Audio Example
+
+<audio controls>
+<source src="https://raw.githubusercontent.com/lamm-mit/PDF2Audio/main/SciAgents%20discovery%20summary%20-%20example.mp3" type="audio/mpeg">
+Your browser does not support the audio element.
+</audio>
+
+## Note
+
+This app requires an OpenAI API key to function.
+
+## Credits
+
+This project was inspired by and based on the code available at [https://github.com/knowsuchagency/pdf-to-podcast](https://github.com/knowsuchagency/pdf-to-podcast) and [https://github.com/knowsuchagency/promptic](https://github.com/knowsuchagency/promptic).
+
+GitHub repo: [lamm-mit/PDF2Audio](https://github.com/lamm-mit/PDF2Audio)
+
+```bibtex
+@article{ghafarollahi2024sciagentsautomatingscientificdiscovery,
+title={SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning},
+author={Alireza Ghafarollahi and Markus J. Buehler},
+year={2024},
+eprint={2409.05556},
+archivePrefix={arXiv},
+primaryClass={cs.AI},
+url={https://arxiv.org/abs/2409.05556},
+}
+@article{buehler2024graphreasoning,
+title={Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning},
+author={Markus J. Buehler},
+journal={Machine Learning: Science and Technology},
+year={2024},
+url={http://iopscience.iop.org/article/10.1088/2632-2153/ad7228},
+}
+```
+
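
The README added in this change notes that the app requires an OpenAI API key. One way to surface that requirement early is a startup check like the sketch below. It assumes the key is supplied through the `OPENAI_API_KEY` environment variable (the variable the OpenAI Python SDK reads by default); `require_openai_api_key` is a hypothetical helper for illustration and is not taken from the PDF2Audio code.

```python
import os
from typing import Mapping, Optional


def require_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key from the environment, or fail with a clear message.

    Hypothetical helper: the real app may obtain the key differently
    (for example, from a text field in the Gradio UI).
    """
    # Allow a mapping to be injected so the check is easy to test.
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before launching the app."
        )
    return key
```

Calling `require_openai_api_key()` before building the interface would make a missing key fail at launch rather than at the first generation request.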