Know Before You Fetch: Calibrated Retrieval-Budget Allocation for Retrieval-Augmented Generation
This article introduces an adaptive RAG framework that allocates retrieval budgets by calibrating sequence log-probability and prefix-logit uncertainty signals into probabilities of correctness. Based on these calibrated probabilities, the system chooses among four actions: answering closed-book, retrieving a compact context (k=1), retrieving a full context (k=5), or abstaining.
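The four-way decision described above can be pictured with a minimal sketch. It is not the article's implementation. It assumes a logistic (Platt-style) calibrator over the two signals, one calibrated probability per action, and a rule that picks the cheapest action clearing a single threshold. The names `calibrate`, `choose_action`, the weights, and the 0.7 threshold are all hypothetical.

```python
import math
from typing import Dict

# Actions ordered from cheapest to most expensive; k is the number of retrieved passages.
ACTIONS = [("closed_book", 0), ("compact", 1), ("full", 5)]


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def calibrate(seq_logprob: float, prefix_uncertainty: float,
              w_logprob: float, w_uncertainty: float, bias: float) -> float:
    """Map two raw uncertainty signals to a probability of correctness.

    Assumption: a logistic calibrator whose weights were fit on held-out
    data. The source does not say which calibration method it uses.
    """
    return _sigmoid(w_logprob * seq_logprob
                    + w_uncertainty * prefix_uncertainty
                    + bias)


def choose_action(p_correct: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the cheapest action whose calibrated probability meets the threshold.

    Assumption: a single shared threshold. If no action clears it, abstain.
    """
    for name, _k in ACTIONS:
        if p_correct[name] >= threshold:
            return name
    return "abstain"


if __name__ == "__main__":
    # Hypothetical calibrated probabilities for one query.
    probs = {"closed_book": 0.40, "compact": 0.75, "full": 0.90}
    print(choose_action(probs))
```

With the hypothetical probabilities in the usage example, closed-book answering falls below the threshold while the k=1 context clears it, so the sketch selects the compact retrieval and never pays for the full k=5 context. A cost-weighted utility could replace the single threshold; the source summary does not say which rule is used.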