poeroz committed
Commit 887709d • 1 Parent(s): 2c21386

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -10,11 +10,11 @@ tags:
 - speech-to-speech
 ---
 
-# 🎧 LLaMA-Omni: Seamless Speech Interaction with Large Language Models
+# 🦙🎧 LLaMA-Omni: Seamless Speech Interaction with Large Language Models
 
 > **Authors: [Qingkai Fang](https://fangqingkai.github.io/), [Shoutao Guo](https://scholar.google.com/citations?hl=en&user=XwHtPyAAAAAJ), [Yan Zhou](https://zhouyan19.github.io/zhouyan/), [Zhengrui Ma](https://scholar.google.com.hk/citations?user=dUgq6tEAAAAJ), [Shaolei Zhang](https://zhangshaolei1998.github.io/), [Yang Feng*](https://people.ucas.edu.cn/~yangfeng?language=en)**
 
-[[Paper]](https://arxiv.org/abs/xxxx.xxxxx) [[Model]](https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni) [[Code]](https://github.com/ictnlp/LLaMA-Omni)
+[[Paper]](https://arxiv.org/abs/2409.06666) [[Model]](https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni) [[Code]](https://github.com/ictnlp/LLaMA-Omni)
 
 LLaMA-Omni is a speech-language model built upon Llama-3.1-8B-Instruct. It supports low-latency and high-quality speech interactions, simultaneously generating both text and speech responses based on speech instructions.
 
@@ -22,13 +22,13 @@ LLaMA-Omni is a speech-language model built upon Llama-3.1-8B-Instruct. It suppo
 
 ## 💡 Highlights
 
-💪 **Built on Llama-3.1-8B-Instruct, ensuring high-quality responses.**
+- 💪 **Built on Llama-3.1-8B-Instruct, ensuring high-quality responses.**
 
-🚀 **Low-latency speech interaction with a latency as low as 226ms.**
+- 🚀 **Low-latency speech interaction with a latency as low as 226ms.**
 
-🎧 **Simultaneous generation of both text and speech responses.**
+- 🎧 **Simultaneous generation of both text and speech responses.**
 
-♻️ **Trained in less than 3 days using just 4 GPUs.**
+- ♻️ **Trained in less than 3 days using just 4 GPUs.**
 
 
 <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/65b7573482d384513443875e/dr4XWUxzuVQ52lBuzNBTt.mp4"></video>
@@ -126,7 +126,7 @@ If our work is useful for you, please cite as:
 @article{fang-etal-2024-llama-omni,
   title={LLaMA-Omni: Seamless Speech Interaction with Large Language Models},
  author={Fang, Qingkai and Guo, Shoutao and Zhou, Yan and Ma, Zhengrui and Zhang, Shaolei and Feng, Yang},
-  journal={arXiv preprint arXiv:xxxx.xxxxx},
+  journal={arXiv preprint arXiv:2409.06666},
  year={2024}
 }
 ```
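
For context beyond the diff itself: the [[Model]] link points to the ICTNLP/Llama-3.1-8B-Omni checkpoint on the Hugging Face Hub. Below is a minimal sketch of fetching that checkpoint with the stock `huggingface_hub` client; it is not part of this commit, and the printed path is purely illustrative.

```python
# Minimal sketch: download the checkpoint referenced by the README's [Model] link.
# Assumes `pip install huggingface_hub` and enough disk space for an 8B-parameter model.
from huggingface_hub import snapshot_download

# snapshot_download fetches every file in the repo and returns the local directory.
local_path = snapshot_download(repo_id="ICTNLP/Llama-3.1-8B-Omni")
print(f"LLaMA-Omni checkpoint available at: {local_path}")
```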