Feature(LLMLingua): update the news
app.py CHANGED
@@ -7,7 +7,7 @@ INTRO = """
 # LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models (EMNLP 2023) [[paper](https://arxiv.org/abs/2310.05736)]
 _Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu_

-This is an early demo of the prompt compression method LLMLingua.
+### This is an <b>early demo</b> of the prompt compression method LLMLingua and <b>the capabilities are limited</b>, restricted to using only the GPT-2 small-size model.

 It should be noted that, due to limited resources, we only provide the **GPT2-Small** language model in this demo. Using **LLaMA2-7B** as the small language model would result in a significant performance improvement, especially at high compression ratios.

@@ -19,10 +19,15 @@ To use it, upload your prompt and set the compression target.
 2. ✅ Set the target_token or compression ratio.
 3. 🤔 Try experimenting with different target compression ratios or other hyperparameters to optimize the performance.

-You can check our [
+You can check our [project page](https://llmlingua.com/)!

 We also have a work, LongLLMLingua, that compresses prompts in long-context scenarios, using less cost while even improving downstream performance.<br>
 [LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression](https://arxiv.org/abs/2310.06839) (Under Review).<br>
+
+## News
+
+- 🎈 We launched a [project page](https://llmlingua.com/) showcasing real-world case studies, including RAG, Online Meetings, CoT, and Code;
+- 👾 LongLLMLingua has been incorporated into the [LlamaIndex pipeline](https://github.com/run-llama/llama_index/blob/main/llama_index/indices/postprocessor/longllmlingua.py), which is a widely used RAG framework.
 """

 INTRO_EXAMPLES = '''
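
For readers trying the demo, the target_token / compression-ratio knobs in the hunk above correspond to the `llmlingua` Python API. Below is a minimal sketch assuming the package's `PromptCompressor` interface as described in the project README; the example strings, the GPT-2 model choice, and the returned-dict key are illustrative assumptions, not this demo's exact configuration.

```python
# Minimal sketch of compressing a prompt with LLMLingua (assumed API per
# the project README; not the exact code behind this demo).
from llmlingua import PromptCompressor

# The demo uses GPT2-Small as the small language model; per the text above,
# LLaMA2-7B performs noticeably better at high compression ratios.
compressor = PromptCompressor(model_name="gpt2")

context = [
    "Example demonstration 1 ...",  # hypothetical few-shot demonstrations
    "Example demonstration 2 ...",
]
result = compressor.compress_prompt(
    context,
    instruction="Answer the question using the demonstrations.",  # hypothetical
    question="What is prompt compression?",                       # hypothetical
    target_token=200,  # step 2: set target_token, or pass ratio=... instead
)
print(result["compressed_prompt"])  # assumed key in the returned dict
```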
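
Since the News bullet points at the LlamaIndex integration, here is a rough sketch of how LongLLMLingua slots into a RAG pipeline as a node postprocessor. It assumes the `LongLLMLinguaPostprocessor` class from the linked module and the LlamaIndex API of that era (`VectorStoreIndex`, `as_query_engine`); the constructor arguments and data path are illustrative and worth checking against current docs.

```python
# Sketch of wiring LongLLMLingua into a LlamaIndex RAG pipeline via the
# postprocessor module linked above (arguments are assumptions drawn from
# LlamaIndex examples of the time).
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor

documents = SimpleDirectoryReader("data").load_data()  # hypothetical local corpus
index = VectorStoreIndex.from_documents(documents)

# Compress the retrieved nodes before they reach the LLM.
postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=300,             # compression target for the retrieved context
    rank_method="longllmlingua",  # LongLLMLingua's question-aware ranking
)

query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[postprocessor],
)
print(query_engine.query("What does LLMLingua compress?"))
```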