The authors propose a multi-agent framework that sanitizes retrieved content in Retrieval-Augmented Generation (RAG) systems through semantic rewriting to prevent privacy leakage from malicious prompts. By employing three specialized agents for privacy extraction, semantic analysis, and reconstruction, the approach removes sensitive identifiers while preserving the core meaning of the text.
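
The three-stage flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary describes three specialized agents, whereas this sketch substitutes plain rule-based functions for them. Every name here (`PII_PATTERNS`, `extract_private_spans`, `analyze_semantics`, `reconstruct`, `sanitize`) is hypothetical, and the regular expressions cover only simple email and phone formats.

```python
import re

# Assumption: two toy patterns stand in for the privacy-extraction agent.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def extract_private_spans(text):
    """Stage 1 (stand-in): collect (label, matched string) pairs."""
    spans = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((label, match.group()))
    return spans


def analyze_semantics(spans):
    """Stage 2 (stand-in): deduplicate spans and order them longest-first,
    so a short identifier never clobbers part of a longer one."""
    unique = set(spans)
    return sorted(unique, key=lambda span: len(span[1]), reverse=True)


def reconstruct(text, ordered_spans):
    """Stage 3 (stand-in): rewrite the text with typed placeholders,
    leaving the rest of the wording untouched."""
    for label, value in ordered_spans:
        text = text.replace(value, f"[{label}]")
    return text


def sanitize(text):
    """Run the three stages in sequence on one retrieved passage."""
    spans = extract_private_spans(text)
    ordered = analyze_semantics(spans)
    return reconstruct(text, ordered)


passage = "Contact jane.doe@example.com or 555-123-4567 about the rash"
print(sanitize(passage))
```

Placeholder substitution is a much cruder operation than semantic rewriting; it is used here only to make the extraction, analysis, and reconstruction hand-offs concrete.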

  • Evaluated on the ChatDoctor and Wiki-PII datasets across six large language models; on LLaMA-3-8B, targeted information exposure falls from 144 instances to 1.
  • Preserves contextual fidelity, with a BLEU-1 score of 0.122 against 0.117 for the SAGE method.
  • Runs as an asynchronous preprocessing module: rewriting is a one-time offline step, so it adds no latency to online inference.
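
The offline/online split in the last point can be illustrated with a small sketch. It assumes a toy in-memory corpus, a hypothetical `sanitize` placeholder that masks only email addresses, and naive word-overlap ranking in place of a real retriever; none of these details come from the paper. The point is only that rewriting happens when the index is built, so the query path never calls it.

```python
import re


def sanitize(text):
    # Hypothetical placeholder for the rewriting step: masks emails only.
    return re.sub(r"[\w.+-]+@[\w-]+\.\w+", "[EMAIL]", text)


def build_sanitized_index(raw_docs):
    """Offline, one-time: rewrite each document and keep only the clean copy."""
    return [sanitize(doc) for doc in raw_docs]


def retrieve(index, query, k=1):
    """Online: rank already-sanitized documents by word overlap with the query.
    No rewriting happens on this path."""
    query_words = set(query.lower().split())
    ranked = sorted(
        index,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


raw_docs = [
    "Patient reports migraine; reach them at pat@clinic.org for follow-up",
    "General advice on hydration and sleep",
]
index = build_sanitized_index(raw_docs)   # done once, before serving queries
top = retrieve(index, "migraine follow-up")
print(top[0])
```

Because the retriever only ever sees the sanitized index, a prompt that tries to extract raw identifiers has nothing to recover from the retrieved context in this sketch.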

The framework effectively mitigates privacy risks in sensitive scenarios without compromising contextual accuracy or adding inference latency.