Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs

Researchers propose TIGRAG (Token-Induced GraphRAG), a framework that builds scalable knowledge graphs from token co-occurrence statistics for efficient retrieval-augmented generation (RAG). It targets the limitations of standard RAG in multi-hop reasoning while avoiding the expensive LLM-based extraction pipelines otherwise used to build such graphs.

TIGRAG constructs its graphs from sliding-window co-occurrence statistics, so graph construction scales without complex extraction steps. The system combines graph-based semantic expansion with neural reranking to retrieve interconnected evidence for multi-hop reasoning. It also introduces an iterative, entity-driven retrieval strategy that progressively expands the query with bridging entities found in previously retrieved contexts.
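
To make the sliding-window idea concrete, here is a minimal Python sketch of a token co-occurrence graph and a simple neighbor-based query expansion. It is an illustration under stated assumptions, not the authors' implementation: the function names (`tokenize`, `build_cooccurrence_graph`, `expand_query`), the regex tokenizer, the window size, and the raw-count edge weights are all hypothetical choices, and the neural reranking step is left out entirely.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the source does not specify a tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_cooccurrence_graph(docs, window=3, min_count=1):
    """Count token pairs that fall inside the same sliding window.

    `window` is the number of tokens a window spans, so each token is
    paired with the next `window - 1` tokens. Both values are assumptions.
    """
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + window]:
                if left != right:
                    # Sort the pair so the edge is undirected.
                    counts[tuple(sorted((left, right)))] += 1

    graph = defaultdict(dict)
    for (a, b), count in counts.items():
        if count >= min_count:
            graph[a][b] = count
            graph[b][a] = count
    return dict(graph)


def expand_query(query, graph, top_k=2):
    """Append each query term's strongest co-occurrence neighbors (hypothetical expansion rule)."""
    terms = tokenize(query)
    expanded = list(terms)
    for term in terms:
        neighbors = graph.get(term, {})
        # Highest count first; ties broken alphabetically for determinism.
        ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
        for neighbor, _ in ranked[:top_k]:
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded


docs = [
    "marie curie discovered polonium",
    "polonium was named after poland",
]
graph = build_cooccurrence_graph(docs, window=3)
expanded_terms = expand_query("curie", graph, top_k=2)
```

In this toy corpus, "polonium" is linked to tokens from both sentences, which is the kind of bridging term a multi-hop query needs. An iterative variant could plausibly re-run the expansion using terms drawn from the passages retrieved in the previous round, though the summary above does not describe that procedure in detail.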

Experiments on three multi-hop question answering benchmarks show that TIGRAG consistently outperforms dense retrieval and graph-based RAG methods in both retrieval and downstream tasks, while reducing indexing time, inference latency, and prompt footprint.