ProvenanceGuard is a source-aware verifier for MCP-based LLM agents. It detects cross-source conflation by routing each claim to a specific evidence source and comparing the attribution the agent states with the source that actually owns the supporting evidence. On 260 source-eligible claims it achieves a block F1 of 0.802 and a source accuracy of 0.858, outperforming source-blind baselines, and it detects all injected attribution swaps in 50 clinical probes.
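The routing-and-comparison idea in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Claim`, `route`, and `verify`, the token-overlap scoring, and the `min_support` threshold are all hypothetical stand-ins chosen only to show the shape of a check that compares stated attribution with source ownership.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    stated_source: str  # source the agent says the claim came from


def tokens(text: str) -> set[str]:
    # Lowercased words with surrounding punctuation stripped.
    return {w.strip(".,;:").lower() for w in text.split() if w.strip(".,;:")}


def route(claim: Claim, evidence: dict[str, str]) -> tuple[str | None, float]:
    """Pick the source whose text covers the largest share of the claim's tokens.

    Token overlap is an assumed placeholder for whatever matcher a real
    verifier would use.
    """
    claim_toks = tokens(claim.text)
    best, best_score = None, 0.0
    for source, text in evidence.items():
        score = len(claim_toks & tokens(text)) / max(len(claim_toks), 1)
        if score > best_score:
            best, best_score = source, score
    return best, best_score


def verify(claim: Claim, evidence: dict[str, str], min_support: float = 0.5) -> str:
    """Compare the stated source with the source that actually supports the claim."""
    owner, score = route(claim, evidence)
    if owner is None or score < min_support:
        return "unsupported"      # no source backs the claim well enough
    if owner != claim.stated_source:
        return "misattributed"    # supported, but by a different source
    return "ok"


# Hypothetical evidence from two tool outputs.
evidence = {
    "lab_results": "Potassium was 5.9 mmol/L on admission",
    "medication_list": "Patient takes lisinopril 10 mg daily",
}
swapped = Claim("Patient takes lisinopril 10 mg daily", stated_source="lab_results")
print(verify(swapped, evidence))
```

A source-blind check would accept the swapped claim, because its content is supported somewhere in the evidence; only the comparison against `stated_source` exposes the conflation.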
arXiv cs.AI · 8d ago · research
ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents
Importance 3/3
Beats a top-lab benchmark
New feature vs. leaders
OpenAI
Anthropic
Google DeepMind
AI agents
Reasoning models
Retrieval & RAG
Benchmarks
| Metric | Model | Score |
|---|---|---|
| Block F1 (260 source-eligible claims) | ProvenanceGuard | 0.802 |
| Source accuracy (260 source-eligible claims) | ProvenanceGuard | 0.858 |