AI agents
arxiv arXiv cs.AI · 8d ago

Agentic AI Framework Reduces Diagnostic Errors in Healthcare

A multi-agent AI framework addresses premature diagnostic handoff and silent hallucinations in healthcare by enforcing structured clinical protocol completion and epistemic uncertainty quantification. Evaluations on 150 simulated cases show 49.3% diagnostic precision, an 11.3 percentage point improvement over baseline, with a statistically significant negative correlation between OLDCARTS completeness and diagnostic uncertainty.

arxiv arXiv cs.AI · 8d ago

Meta-Knowledge Reutilization in Reinforcement Learning

A new framework learns task-level knowledge on a simplified agent and transfers it to heterogeneous agents. It uses Bayesian non-parametric priors and a high-level policy to generate task guidance, with a semantic-magnitude interface and temporal adaptor to align meta-knowledge with embodiment-specific controllers. Experiments show 94.75% to 99.79% reduction in final-step tracking error and comparable performance using 23.8% of the interaction data of state-of-the-art methods.

arxiv arXiv cs.AI · 8d ago

Flash Endurance as Depreciating Capital in Robot Memory

A robot's flash memory endurance is a non-renewable asset that degrades with each write. A wear-aware pricing model introduces a shadow price $η$ to guide memory placement across RAM, NVM, and cloud, with optimal routing depending on the value-write association $χ$. Empirical measurements show $χ$ is positive in long-horizon manipulation, null in short-horizon tasks, and negative in teleoperation, and the endurance budget is binding only on low-end QLC/eMMC memory, where wear-aware control influences routing based on task value without improving performance.

arxiv arXiv cs.AI · 8d ago

IUU+DB: LLM-Driven Database for Illegal Fishing and Supply Chain Crimes

IUU+DB is a large language model-driven system that tracks illegal, unreported, and unregulated fishing, seafood fraud, and labor abuse. It extracts key data elements from diverse documents, classifies relevant incidents, and enables trend analysis to identify geographic and behavioral hotspots. The system supports research, risk assessments, and policy enforcement in fisheries and supply chains.

arxiv arXiv cs.AI · 8d ago

Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement

VERITAS introduces a generator-verifier framework that enables robots to improve policies in real time without additional training. A visual verifier evaluates actions at inference time, allowing consistent performance gains through verified rollouts that serve as effective supervision for offline policy improvement. Post-training with these verified rollouts matches expert demonstrations in efficiency, without human intervention.