Feature(LLMLingua): update the news
app.py CHANGED
@@ -7,7 +7,7 @@ INTRO = """
 # LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models (EMNLP 2023) [[paper](https://arxiv.org/abs/2310.05736)]
 _Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_

-This is an early demo of the prompt compression method LLMLingua.
+### This is an <b>early demo</b> of the prompt compression method LLMLingua, and <b>the capabilities are limited</b>: it is restricted to using only the GPT-2 small-size model.

 It should be noted that due to limited resources, we only provide the **GPT2-Small** size language model in this demo. Using **LLaMA2-7B** as the small language model would result in a significant performance improvement, especially at high compression ratios.
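As context for the model note above, here is a minimal sketch of how the small language model is chosen with the `llmlingua` package. The model ids ("gpt2", "NousResearch/Llama-2-7b-hf") and the `device_map` values are assumptions about the public API, not code from this commit.

```python
# Minimal sketch (not from this commit): picking the small language model
# that LLMLingua uses to score token importance during compression.
from llmlingua import PromptCompressor

# GPT2-Small keeps the demo cheap to host (assumed HF model id "gpt2").
compressor = PromptCompressor(model_name="gpt2", device_map="cpu")

# A stronger small model such as LLaMA2-7B typically compresses better,
# especially at high compression ratios (assumed HF model id below).
# compressor = PromptCompressor(
#     model_name="NousResearch/Llama-2-7b-hf", device_map="cuda"
# )
```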
@@ -19,10 +19,15 @@ To use it, upload your prompt and set the compression target.
 2. ✅ Set the target_token or compression ratio.
 3. 🤔 Try experimenting with different target compression ratios or other hyperparameters to optimize the performance.

-You can check our [
+You can check our [project page](https://llmlingua.com/)!

 We also have a work, LongLLMLingua, that compresses prompts in long-context scenarios, cutting cost while even improving downstream performance.<br>
 [LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression](https://arxiv.org/abs/2310.06839) (Under Review).<br>
+
+## News
+
+- 🎈 We launched a [project page](https://llmlingua.com/) showcasing real-world case studies, including RAG, Online Meetings, CoT, and Code;
+- 👾 LongLLMLingua has been incorporated into the [LlamaIndex pipeline](https://github.com/run-llama/llama_index/blob/main/llama_index/indices/postprocessor/longllmlingua.py), which is a widely used RAG framework.
 """

 INTRO_EXAMPLES = '''
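Step 2 above ("Set the target_token or compression ratio") maps roughly onto a `compress_prompt` call. The argument and result names below follow the public `llmlingua` API and should be read as a sketch, not as code from `app.py`.

```python
# Sketch of the compression-target step; names assumed from the public
# llmlingua API, not taken from this commit.
from llmlingua import PromptCompressor

compressor = PromptCompressor(model_name="gpt2", device_map="cpu")
result = compressor.compress_prompt(
    "Long prompt text to be compressed ... " * 40,  # toy context
    instruction="Answer the question based on the context.",
    question="What is prompt compression?",
    target_token=100,  # absolute token budget for the compressed prompt
    # ratio=0.5,       # or, alternatively, a relative compression ratio
)
print(result["compressed_prompt"])
print(result["origin_tokens"], "->", result["compressed_tokens"])
```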
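Since the news points readers at the LlamaIndex integration, here is a hedged sketch of wiring `LongLLMLinguaPostprocessor` into a query engine. The import path matches the linked module; the constructor arguments and the surrounding index setup are assumptions about the llama_index API of that period.

```python
# Sketch of the LlamaIndex integration from the news item; class name and
# import path follow the linked module, other details are assumptions
# (llama_index ~0.9-era API, with an OpenAI key configured for the query LLM).
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Compress the retrieved nodes with LongLLMLingua before they reach the LLM.
postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=300,             # token budget for the compressed context
    rank_method="longllmlingua",  # question-aware coarse-level ranking
)

query_engine = index.as_query_engine(
    similarity_top_k=10,          # retrieve generously, then compress
    node_postprocessors=[postprocessor],
)
print(query_engine.query("What does LongLLMLingua do?"))
```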