Reasoning models
arXiv cs.LG · 8d ago

Learning Fair Pareto-Optimal Policies in Multi-Objective Reinforcement Learning

The paper introduces a framework for multi-policy multi-objective reinforcement learning that learns a set of Pareto-optimal policies ensuring fairness across diverse user preferences. It proves fair policies remain within the convex coverage set for concave welfare functions and proposes three algorithms that incorporate non-stationary and stochastic policy dynamics. Empirical results show these methods effectively learn fair policies adaptable to varying user preferences.
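
To make the role of a concave welfare function concrete, here is a minimal sketch comparing a linear (utilitarian) aggregate with a concave one. Nash welfare (the sum of log returns) is used as one common concave choice; the summary does not say which welfare functions the paper uses, and the policy names and return values are hypothetical.

```python
import math

# Hypothetical expected returns per objective for three candidate policies.
# These numbers are illustrative only and do not come from the paper.
policy_returns = {
    "A": [9.0, 1.0],
    "B": [5.0, 4.0],
    "C": [1.0, 9.0],
}

def utilitarian(returns):
    """Plain sum of objective returns (linear, ignores balance)."""
    return sum(returns)

def nash_welfare(returns):
    """Sum of logs: a concave welfare function that rewards balanced returns."""
    return sum(math.log(r) for r in returns)

def best_policy(welfare):
    """Return the name of the policy that maximizes the given welfare function."""
    return max(policy_returns, key=lambda name: welfare(policy_returns[name]))

if __name__ == "__main__":
    print("utilitarian pick:", best_policy(utilitarian))
    print("Nash welfare pick:", best_policy(nash_welfare))
```

With these toy numbers the linear sum prefers a lopsided policy, while the concave aggregate prefers the balanced policy B, which is the intuition behind treating concave welfare as a fairness criterion.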

arXiv cs.LG · 8d ago

Flash Endurance as Depreciating Capital in Robot Memory

A robot's flash memory degrades with each write, making its endurance a non-renewable asset. A wear-aware pricing model uses a shadow price $η$ to guide memory placement across RAM, NVM, and cloud, with optimal routing depending on whether task value increases with memory persistence. The sign of the value-write association $χ$ varies by deployment: positive in long-horizon manipulation, null in short-horizon tasks, and negative in teleoperation. The endurance budget binds only on low-end QLC/eMMC memory, and although wear-aware routing aligns with task value, actual performance improvements have not yet been verified in data.
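
The idea of pricing flash wear with a shadow price can be sketched as follows. This is only an illustration under stated assumptions: the linear form `value - eta * wear`, the function `route_write`, and all numbers are hypothetical and are not taken from the paper.

```python
def route_write(value_by_tier, flash_wear_by_tier, eta):
    """Pick the tier with the highest task value net of priced flash wear.

    value_by_tier: hypothetical task value of storing the item in each tier.
    flash_wear_by_tier: hypothetical flash wear incurred by each tier.
    eta: shadow price per unit of flash wear (assumed linear pricing).
    """
    def net(tier):
        return value_by_tier[tier] - eta * flash_wear_by_tier[tier]
    return max(value_by_tier, key=net)

# Illustrative numbers only: NVM gives the most value but consumes endurance.
value = {"ram": 1.0, "nvm": 3.0, "cloud": 2.0}
wear = {"ram": 0.0, "nvm": 1.0, "cloud": 0.0}

if __name__ == "__main__":
    for eta in (0.5, 2.0):
        print(f"eta={eta}: route to {route_write(value, wear, eta)}")
```

In this toy setup a low shadow price keeps the write on NVM, while a high one diverts it to the cloud, mirroring how a binding endurance budget would change placement.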

arXiv cs.LG · 8d ago

Kolmogorov Regression for Robust Diffusion Policies

A backward Kolmogorov equation lifts diffusion policies to a Cameron-Martin space, replacing stochastic score matching with a deterministic PDE. This approach achieves convergence bounds tied to kernel effective rank, improves trajectory regularity, and enables a deterministic failure detector without rewards. Validation shows 17% higher reward on PushT, 28.4% lower RMSE on a manufacturing line, and a 96% reduction in deadlock events via Hamilton-Jacobi certification.

arXiv cs.LG · 8d ago

ATT&CK-Labeled Multi-Source Cybersecurity Logs Dataset Released

A new dataset combines system, network, and browser logs from 870 Windows sessions, including 70 attacks and 800 benign cases. It provides per-event labels with MITRE ATT&CK technique IDs for 12 tactics and 53 techniques, using real attack tools such as RATs and C2 tunnels. Fine-tuning three Small Language Models (SLMs) via LoRA improved chunk classification accuracy to 90–97% and achieved up to 42% exact-match accuracy in technique identification, indicating that the models capture much of the underlying reasoning even though exact technique identification remains difficult.

arXiv cs.CL · 8d ago

LegalHalluLens: Auditing Hallucinations in Legal AI

LegalHalluLens introduces a framework to audit AI hallucinations in legal contexts by analyzing typed hallucination profiles across four claim categories. It reveals a 38-40 point gap between obligation/numeric and temporal claims, and shows two systems with identical 52% hallucination rates can have opposite risk directions. The framework uses a Risk Direction Index and calibrated debate pipelines to reduce fabricated detections by 45% and improve accountability in legal AI deployment.

arXiv cs.CL · 8d ago

ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents

ProvenanceGuard introduces a source-aware verifier for MCP-based LLM agents that detects cross-source conflation by routing claims to specific evidence sources and comparing stated attribution with actual source ownership. It achieves block F1 of 0.802 and source accuracy of 0.858 on 260 source-eligible claims, outperforming source-blind baselines, and detects all injected attribution swaps in 50 clinical probes.
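
The comparison of stated attribution with actual source ownership can be sketched roughly as below. This is not ProvenanceGuard's implementation: the `Claim` type, the function names, the source names, and the substring match (a stand-in for a real evidence or entailment check) are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str           # the factual statement made by the agent
    stated_source: str  # the source the agent says it came from

def owning_sources(claim, evidence):
    """Return the names of sources whose evidence contains the claim text.

    Substring matching is a stand-in for a real verification step.
    """
    needle = claim.text.lower()
    return [name for name, doc in evidence.items() if needle in doc.lower()]

def check_attribution(claim, evidence):
    """Classify a claim as 'ok', 'misattributed', or 'unsupported'."""
    owners = owning_sources(claim, evidence)
    if not owners:
        return "unsupported"
    if claim.stated_source in owners:
        return "ok"
    return "misattributed"

# Hypothetical evidence returned by two tool sources.
evidence = {
    "lab_api": "Potassium is 4.1 mmol/L.",
    "notes_api": "Patient reports mild headache.",
}

if __name__ == "__main__":
    swapped = Claim("potassium is 4.1 mmol/L", stated_source="notes_api")
    print(check_attribution(swapped, evidence))
```

A source-blind check would accept the swapped claim because the fact appears somewhere in the evidence; checking ownership per source is what flags it as misattributed.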

arXiv cs.CL · 8d ago

AI's Synthetic Lived Experience in Caregiver Support

LLMs can generate peer-like responses that mimic personal narratives, creating a false impression of lived experience. Psycholinguistic analysis shows human peers use more first-person and past-focused language than AI, and AI often fabricates experiential grounding without real experience. This synthetic lived experience paradox risks misleading caregivers, necessitating mechanisms to distinguish supportive framing from fabricated experience.

arXiv cs.CL · 8d ago

RubricsTree: Scalable Evaluation Framework for Personal Health Agents

RubricsTree introduces a hierarchical taxonomy of over 100 clinically verifiable Boolean rubrics, evolved from 4,000 real user queries via human-in-the-loop curation. It enables scalable, expert-aligned evaluation of personal health agents by dynamically routing queries to relevant rubrics. It outperforms baseline methods in alignment and context sensitivity, and yields model performance gains of up to 66% on HealthBench.

arXiv cs.CL · 8d ago

Darshana Graph: A Corpus for Comparative Indian Philosophy

Darshana Graph presents a corpus of over 125,000 text records from Hindu, Buddhist, and Jain philosophical sources. It includes a unique subset of 8,500 aligned records from 18 commentators across five schools, enabling cross-commentator comparison. The corpus supports stylometric analysis and a large language model pipeline that extracts philosophical concept relationships, revealing disagreement patterns and extraction limitations.