arXiv cs.CL · 8h ago

PolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents

Researchers introduce PolicyGuard, a sub-agent verifier designed to improve policy adherence in LLM agents by reasoning over the full dialogue context rather than relying on external checks of individual arguments. This approach addresses the limitations of prior safeguarding methods that often underestimate the need for conversation-specific remediation and explicit user confirmation.

arXiv cs.CL · 8h ago

Travel-Oriented Reasoning Large Language Model via Domain-Specific Knowledge Graphs

Researchers propose a modular pipeline for building a travel-domain reasoning large language model grounded in an expert-designed knowledge graph to address accuracy and reliability issues in specialized domains. The approach integrates a travel knowledge graph, a bottom-up construction procedure for multi-hop question-answer pairs, and supervised fine-tuning to embed domain knowledge as auditable reasoning traces.

arXiv cs.CL · 8h ago

The Complexity Ceiling Benchmark: A Multi-Domain Evaluation of Sequential Reasoning Under Depth Scaling

The Complexity Ceiling Benchmark (CCB) measures how language-model reasoning performance decays as the number of required sequential steps increases, holding semantic content fixed while varying task depth from 5 to 50. The study reports consistent geometric per-step decay across three distinct regimes: grounded spatial state-tracking, abstract symbolic pointer manipulation, and transitive relational inference.
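As a rough illustration of what geometric per-step decay implies, the sketch below assumes each step succeeds independently with the same probability, so success at depth d is that probability raised to the power d. The function names, the 0.98 per-step rate, and the 0.5 threshold are hypothetical choices for illustration, not values from the paper.

```python
def accuracy_at_depth(per_step_success: float, depth: int) -> float:
    """Expected accuracy after `depth` sequential steps, assuming every step
    succeeds independently with probability `per_step_success`."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must lie in [0, 1]")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return per_step_success ** depth


def deepest_passing_depth(per_step_success: float, threshold: float,
                          depths=range(5, 51)):
    """Largest depth in `depths` whose expected accuracy is still at or above
    `threshold`, or None if no depth qualifies."""
    passing = [d for d in depths
               if accuracy_at_depth(per_step_success, d) >= threshold]
    return max(passing) if passing else None


if __name__ == "__main__":
    # Hypothetical per-step success rate, swept over the depth range 5 to 50.
    for depth in (5, 10, 25, 50):
        print(depth, round(accuracy_at_depth(0.98, depth), 3))
    print("deepest depth above 0.5:", deepest_passing_depth(0.98, 0.5))
```

Under this model even a high per-step success rate compounds into a steep drop at large depths, which is one way to read a "complexity ceiling".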

arXiv cs.CL · 8h ago

Deterministic Decisions for High-Stakes AI

The paper identifies "intervention bias" as a critical failure mode in zero-shot LLM educational advisory agents: the agent recommends action even when the oracle policy mandates inaction. Using the Open University Learning Analytics Dataset, the study shows that zero-shot GPT-4o has a 43% false-positive rate at day 56, which corresponds to approximately 4,300 unnecessary advisor contacts per cycle for a cohort of 10,000 students.
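The contact estimate follows from simple arithmetic. The sketch below reproduces it under the assumption, implied by the summary's figures, that the false-positive rate is applied to the whole 10,000-student cohort; the function name is hypothetical.

```python
def unnecessary_contacts(false_positive_rate: float, cohort_size: int) -> int:
    """Expected number of advisor contacts triggered for students who needed
    none, assuming the rate applies uniformly across the cohort."""
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must lie in [0, 1]")
    return round(false_positive_rate * cohort_size)


if __name__ == "__main__":
    # Figures from the summary: 43% false positives, 10,000 students.
    print(unnecessary_contacts(0.43, 10_000))
```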

arXiv cs.LG · 9h ago

AsyncOPD: How Stale Can On-Policy Distillation Be?

The paper presents AsyncOPD, a fully asynchronous on-policy distillation pipeline that decouples rollout generation from learner updates to relieve training bottlenecks in large language model post-training. The authors describe it as the first systematic study of staleness effects in this setting and report that teacher-weighted forward KL is robust to stale rollouts, while student-weighted reverse KL is vulnerable to them.
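For readers unfamiliar with the two objectives, the sketch below shows the standard textbook definitions over a single next-token distribution: forward KL takes the expectation under the teacher, reverse KL under the student. The toy probabilities are invented for illustration, and the paper's exact per-token weighting may differ from this minimal form.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.
    Assumes q_i > 0 wherever p_i > 0; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy next-token distributions over a three-token vocabulary (hypothetical).
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

# Forward KL: terms are weighted by the teacher's probabilities.
forward_kl = kl_divergence(teacher, student)
# Reverse KL: terms are weighted by the student's probabilities.
reverse_kl = kl_divergence(student, teacher)

if __name__ == "__main__":
    print("forward KL(teacher || student):", forward_kl)
    print("reverse KL(student || teacher):", reverse_kl)
```

The two quantities generally differ because each is an expectation under a different distribution, which is why staleness in the sampled rollouts can affect them differently.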