Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts
Abstract
In the era of large language models, applying techniques such as Retrieval Augmented Generation can better address Open-Domain Question-Answering problems. Due to constraints including model size and computing resources, the context length is often limited, making it challenging for the model to cover overlong contexts while answering questions from open domains. This paper proposes a general and convenient method for covering longer contexts in Open-Domain Question-Answering tasks. It leverages a small encoder language model that effectively encodes contexts, and the encoded representations are combined with the original inputs via cross-attention. With our method, the original language models can cover contexts several times longer while keeping the computing requirements close to the baseline. Our experiments demonstrate that after fine-tuning, performance improves across two held-in datasets, four held-out datasets, and two In-Context Learning settings.
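The mechanism the abstract describes, in which a small encoder compresses long contexts into vectors and the original model attends to those vectors, can be sketched as follows. This is a minimal single-head NumPy illustration under assumed shapes, without the learned projection matrices a real implementation would use; it is not the paper's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context_vectors):
    """Attend from the original model's token states (queries) to the
    encoder's compressed context vectors (used here as keys and values)."""
    d = queries.shape[-1]
    scores = queries @ context_vectors.T / np.sqrt(d)   # (n_tokens, n_ctx)
    weights = softmax(scores, axis=-1)                  # rows sum to 1
    return weights @ context_vectors                    # (n_tokens, d)

rng = np.random.default_rng(0)
token_states = rng.standard_normal((4, 64))   # states for the original input
ctx_vectors = rng.standard_normal((32, 64))   # vectorized long-context chunks
out = cross_attention(token_states, ctx_vectors)
print(out.shape)  # (4, 64)
```

Because the long context enters only through the fixed-size `ctx_vectors`, the original model's sequence length (and hence its compute) stays close to the baseline regardless of how much context was encoded.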
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference (2024)
- LLoCO: Learning Long Contexts Offline (2024)
- Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models (2024)
- FlashBack: Efficient Retrieval-Augmented Language Modeling for Long Context Inference (2024)
- Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization (2024)