Security and Privacy in Retrieval-Augmented Generation: Architectures, Threats, Defenses, and Future Directions

This survey examines the security and privacy challenges of Retrieval-Augmented Generation (RAG) systems across centralized, on-device, federated, and hybrid paradigms. It presents a unified taxonomy of threat surfaces spanning the retrieval, context construction, and generation stages. The analysis covers specific attack classes, including membership inference, index inference, poisoning, gradient leakage, and collusion. It identifies risks to sensitive information in retrieval indices, query logs, context construction, and federated updates. Adversarial manipulation of knowledge bases is highlighted as a key factor undermining trust in generated outputs. The paper reviews architectural, algorithmic, and cryptographic defenses and addresses the associated privacy-utility trade-offs. Finally, it outlines open research challenges for building trustworthy and resilient RAG systems.