MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers
The authors propose MATCH, a framework that augments sparsified attention with dynamically integrated in-context information, addressing the scalability bottleneck of standard dense attention, whose cost grows quadratically with sequence length, in long-context settings.
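To make the general idea concrete, the sketch below shows one generic way sparse attention can be combined with retrieved context: a single query attends to a small local window and to a few earlier positions picked by dot-product similarity. This is an illustration under assumed design choices, not the authors' method. The function name `sparse_attention_with_retrieval`, the `window` and `top_k` parameters, and the similarity-based retrieval rule are all hypothetical; the summary above does not specify how MATCH selects or integrates in-context information.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_attention_with_retrieval(query, keys, values, window=2, top_k=1):
    """Attend to the last `window` positions plus `top_k` retrieved earlier ones.

    Hypothetical illustration only: the selection rule is an assumption,
    not the mechanism described in the MATCH paper. Assumes `keys` and
    `values` are non-empty and have the same length.
    """
    n = len(keys)
    split = max(0, n - window)

    # Sparse part: a fixed local window over the most recent positions.
    local = list(range(split, n))

    # Retrieval part: among positions outside the window, keep the top_k
    # whose keys are most similar to the query.
    distant = range(split)
    retrieved = sorted(distant, key=lambda j: dot(query, keys[j]), reverse=True)[:top_k]

    selected = sorted(set(local) | set(retrieved))

    # Scaled dot-product scores, followed by a numerically stable softmax
    # restricted to the selected positions.
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, keys[j]) * scale for j in selected]
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the selected value vectors.
    dim = len(values[0])
    output = [sum(w * values[j][d] for w, j in zip(weights, selected)) for d in range(dim)]
    return selected, output


# Usage: position 2 lies outside the local window but matches the query best,
# so it is retrieved and attended to alongside positions 3 and 4.
keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[float(j)] for j in range(5)]
selected, output = sparse_attention_with_retrieval([1.0, 0.0], keys, values)
```

In this sketch the query scores only `window + top_k` positions in the softmax instead of all `n`. The retrieval step here still scans every distant key, so a practical system would need a cheaper index or summary to realize the savings; that choice is left open.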