To address the scalability challenges of traditional peer review in the era of AI-assisted science, researchers propose a taxonomy of AI-human collaboration and introduce the Paper Assistant Tool (PAT). PAT is an agentic AI framework designed to ingest full scientific manuscripts and produce comprehensive evaluations by checking theoretical results, validating experiments, and identifying potential flaws.
- PAT uses inference scaling techniques to surface deeper issues than a single model call can, achieving a 34% improvement in recall over a zero-shot baseline on mathematical errors in the SPOT benchmark.
- Pilot deployments at STOC and ICML demonstrate that PAT, used as a pre-submission tool, can identify critical errors and suggest substantive improvements to research papers.
By catching errors early, PAT eases the cognitive burden placed on referees while preserving their control over the outcomes of the review process.
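
To make the recall comparison above concrete, here is a minimal sketch of how recall on known errors might be measured for a single review pass versus several aggregated passes. The summary does not describe how PAT scales inference, so taking the union of independent passes is an assumption made purely for illustration. The names `recall` and `scaled_review`, the stub reviewers, and the error labels are all hypothetical, and the toy numbers are not results from PAT or SPOT.

```python
from typing import Callable, Iterable, Set


def recall(flagged: Set[str], known: Set[str]) -> float:
    """Fraction of known errors that appear among the flagged issues."""
    if not known:
        return 0.0
    return len(flagged & known) / len(known)


def scaled_review(
    passes: Iterable[Callable[[str], Set[str]]], manuscript: str
) -> Set[str]:
    """Run several independent review passes and pool what they flag.

    Assumption: pooling by set union stands in for whatever aggregation
    an actual inference-scaling pipeline would use.
    """
    flagged: Set[str] = set()
    for review in passes:
        flagged |= review(manuscript)
    return flagged


# Hypothetical ground-truth annotations for one manuscript (toy data).
known_errors = {"lemma2", "eq5", "thm1", "fig3"}

# Stub reviewers standing in for model calls; each returns fixed labels.
stub_passes = [
    lambda text: {"eq5"},
    lambda text: {"lemma2", "eq5"},
    lambda text: {"thm1", "typo"},  # "typo" is a flag outside the known set
]

single = stub_passes[0]("manuscript text")
pooled = scaled_review(stub_passes, "manuscript text")

print(f"single-pass recall: {recall(single, known_errors):.2f}")
print(f"pooled recall:      {recall(pooled, known_errors):.2f}")
```

Pooling can only raise recall, since flags are never discarded, but it also accumulates spurious flags such as `typo` here, so a real pipeline would likely need a filtering or verification step before showing results to authors or referees.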