AB-RAG is a training-free, backbone-agnostic framework that dynamically adjusts retrieval effort based on a confidence estimate derived from model certainty, answer-evidence agreement, and retrieval score variance. At each step the system decides whether to stop or to retrieve more evidence within a fixed budget, without retraining the underlying language model.
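The stop-or-retrieve decision can be pictured as a short loop. The sketch below is illustrative only: the function name `adaptive_answer`, the callables `retrieve`, `generate`, and `estimate_confidence`, and the `threshold`, `budget`, and `step` parameters are all hypothetical and are not taken from the framework itself. It also assumes the budget is counted in retrieval rounds, which the summary does not specify.

```python
from typing import Callable, List, Tuple


def adaptive_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str, List[str]], str],
    estimate_confidence: Callable[[str, List[str]], float],
    threshold: float = 0.7,  # hypothetical stopping threshold
    budget: int = 3,         # hypothetical budget, counted in retrieval rounds
    step: int = 2,           # hypothetical number of extra passages per round
) -> Tuple[str, float, int]:
    """Retrieve more evidence until confidence is high enough or the budget runs out."""
    if budget < 1:
        raise ValueError("budget must allow at least one retrieval round")

    answer, conf, rounds_used = "", 0.0, 0
    for round_idx in range(budget):
        # Widen the evidence set a little more on every round.
        passages = retrieve(question, step * (round_idx + 1))
        answer = generate(question, passages)
        conf = estimate_confidence(answer, passages)
        rounds_used = round_idx + 1
        if conf >= threshold:
            break  # confident enough: stop spending budget
    return answer, conf, rounds_used
```

Because the language model is only reached through the `generate` callable, a loop of this shape needs no access to model weights, which is consistent with the training-free, backbone-agnostic framing.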
The confidence estimator combines three signals: direct model certainty (or a self-consistency approximation for closed APIs), agreement between the generated answer and the retrieved evidence, and the variance of retrieval scores. On a factoid dataset, the estimate separated answers cleanly: high-confidence answers reached 57.6% Exact Match, versus 0% for low-confidence answers. The adaptive policy improves accuracy on capable backbones, though the study notes that the confidence signal was unsuitable for short answers.
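One way to picture how three such signals might be folded into a single score is sketched below. Everything specific here is an assumption made for illustration: the summary does not give the combination rule, so the weighted sum, the weights, the token-overlap agreement measure, the majority-vote form of self-consistency, and the choice to treat lower score variance as more trustworthy are all hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def self_consistency(samples: List[str]) -> float:
    """Share of sampled answers that match the most frequent one.

    Stand-in for model certainty when token probabilities are unavailable.
    """
    if not samples:
        return 0.0
    normalized = [s.strip().lower() for s in samples]
    most_common = max(set(normalized), key=normalized.count)
    return normalized.count(most_common) / len(normalized)


def token_overlap(answer: str, passages: List[str]) -> float:
    """Fraction of answer tokens found in the evidence (assumed agreement measure)."""
    answer_tokens = set(answer.lower().split())
    if not answer_tokens:
        return 0.0
    evidence_tokens = set(" ".join(passages).lower().split())
    return len(answer_tokens & evidence_tokens) / len(answer_tokens)


def confidence(
    certainty: float,
    agreement: float,
    retrieval_scores: Sequence[float],
    weights: Sequence[float] = (0.4, 0.4, 0.2),  # hypothetical weights
) -> float:
    """Weighted sum of the three signals, each expected in [0, 1]."""
    if retrieval_scores:
        # Assumption: lower variance is treated as more stable evidence.
        # The source does not state which direction the variance signal takes.
        stability = 1.0 / (1.0 + pvariance(retrieval_scores))
    else:
        stability = 0.0
    w_cert, w_agree, w_stab = weights
    return w_cert * certainty + w_agree * agreement + w_stab * stability
```

A caller working with a closed API would compute `self_consistency` over several sampled answers and pass the result as `certainty`; with access to token probabilities, a direct certainty value could be passed instead.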
This framework offers practical value for commercial API-based systems by optimizing computation usage and providing trust signals for generated answers without requiring model retraining or access to internal parameters.