A study measures tool-intent stabilization in Streaming RAG, defining when speculative tool queries converge to correct answers. On the CRAG benchmark, 73.9% of queries allow substantial latency hiding, with early stabilization observed in questions with verbatim retrievable evidence. Question type significantly predicts early versus late stabilization, informing when speculative triggers are effective.
arXiv cs.CL · 6d ago · research
Tool-Intent Stabilization in Streaming RAG
arXiv cs.CL
Hugging Face
Cohere
Microsoft Research
Evaluation & benchmarks
Research paper
Retrieval & RAG
Benchmarks
| Benchmark | Model | Result |
|---|---|---|
| CRAG | not specified | 73.9% of queries allow substantial latency hiding |