AI agents — korshunov.ai

AI agents Page 1 / 20

Uncertainty Quantification for Flow-Based Vision-Language-Action Models

We propose a method using velocity-field disagreement to quantify epistemic uncertainty in flow-matching vision-language-action models. This uncertainty estimate enables failure detection during deployment and active fine-tuning via the SAVE framework, which reduces expert demonstrations by at least 22% compared to baselines, with better-calibrated predictions on the LIBERO benchmark.

arxiv arXiv cs.LG · 8d ago

Compositional Generalization in Language Model Reasoning

A hierarchical latent selection model shows that supervised fine-tuning and reinforcement learning work together to enable compositional generalization in language models. SFT provides raw module materials, while RL identifies and recombines atomic modules from compound traces to solve new problems. Training on compound traces leads to stronger generalization than isolated module training, and an effective protocol is found where SFT ensures module coverage and RL drives exploration of novel compositions.

arxiv arXiv cs.LG · 8d ago

OmniPlan: Adaptive Framework for Timely and Near-Optimal Network Planning

OmniPlan introduces an adaptive framework that converts natural-language user intents into quantifiable preferences using a large language model. It dynamically selects among mixed integer programming, heuristics, and deep reinforcement learning experts to achieve both timeliness and near-optimality in network planning. Evaluations on distributed machine learning workloads show up to 97.8% latency reduction and 11.5% lower resource consumption.

arxiv arXiv cs.LG · 8d ago

Handlebars Triple-Brace Injection Exploits Structural Role Delimiters

Handlebars' triple-brace interpolation fails to protect against structural role injection, as HTML escaping only neutralizes angle-bracket delimiters. It leaves colon and Markdown hash delimiters intact, enabling attackers to hijack model behavior. The default escaping provides no protection for most role delimiter schemes and cannot replace a clear separation of instructions and data.

arxiv arXiv cs.LG · 8d ago

Embedded ML Workflow for Microcontroller Edge Devices

This paper outlines a systems-oriented workflow for embedded machine learning on microcontroller-class devices. It details key engineering decisions such as data sampling, feature extraction, class imbalance validation, model-runtime co-design, and streaming deployment, using inertial motion recognition and keyword spotting as case studies. The work provides practical design rules for robust on-device inference, including data curation, quantization, thresholding, scheduling, and field monitoring.

arxiv arXiv cs.LG · 8d ago

Flash Endurance as Depreciating Capital in Robot Memory

A robot's flash memory degrades with each write, forming a non-renewable asset. A wear-aware pricing model uses a shadow price $η$ to guide memory placement across RAM, NVM, and cloud, with optimal routing depending on whether task value increases with memory persistence. The sign of the value-write association $χ$ varies by deployment: positive in long-horizon manipulation, null in short-horizon tasks, and negative in teleoperation. The endurance budget is binding only on low-end QLC/eMMC memory, and while wear-aware routing aligns with task value, actual performance improvements remain unverified in data.

arxiv arXiv cs.LG · 8d ago

ATT&CK-Labeled Multi-Source Cybersecurity Logs Dataset Released

A new dataset combines system, network, and browser logs from 870 Windows sessions, including 70 attacks and 800 benign cases. It provides per-event labels with MITRE ATT&CK technique IDs for 12 tactics and 53 techniques, using real attack tools like RAT and C2 tunnels. Fine-tuning three Small Language Models (SLMs) via LoRA improved chunk classification accuracy to 90–97% and achieved up to 42% exact-match accuracy in technique identification, showing strong reasoning capture despite challenges.

arxiv arXiv cs.LG · 8d ago

Learning Red Agent Policy from Observations for Neurosymbolic Cyber Agents

A policy learning technique using imitation learning is proposed to predict red agent actions in partially observable cyber environments. The method learns red agent policies from network observations and defender actions, enabling neurosymbolic cyber-defense agents to accurately predict attacks and adapt defenses in diverse simulated scenarios.

arxiv arXiv cs.LG · 8d ago

AdaVoMP: Adaptive Volumetric Mechanical Property Fields

AdaVoMP predicts accurate spatially-varying Young's modulus, Poisson's ratio, and density for 3D objects across resolutions. It uses a sparse, adaptive voxel structure and a sparse transformer encoder-decoder to achieve 16^3 times higher resolution than prior methods, with improved accuracy and lower test-time compute.

arxiv arXiv cs.LG · 8d ago

ReproRepo: Scalable Reproducibility Audits with GitHub Issues

ReproRepo introduces a scalable framework using GitHub issues to evaluate ML paper reproducibility. It shows that LLM agents like Codex with GPT-5.5 identify at least one human-reported blocker in 90% of 1,149 ML papers, highlighting their ability to detect visible failures and semantic issues, though exact localization remains limited.

arxiv arXiv cs.CL · 8d ago

LegalHalluLens: Auditing Hallucinations in Legal AI

LegalHalluLens introduces a framework to audit AI hallucinations in legal contexts by analyzing typed hallucination profiles across four claim categories. It reveals a 38-40 point gap between obligation/numeric and temporal claims, and shows two systems with identical 52% hallucination rates can have opposite risk directions. The framework uses a Risk Direction Index and calibrated debate pipelines to reduce fabricated detections by 45% and improve accountability in legal AI deployment.

arxiv arXiv cs.CL · 8d ago

ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents

ProvenanceGuard introduces a source-aware verifier for MCP-based LLM agents that detects cross-source conflation by routing claims to specific evidence sources and comparing stated attribution with actual source ownership. It achieves block F1 of 0.802 and source accuracy of 0.858 on 260 source-eligible claims, outperforming source-blind baselines, and detects all injected attribution swaps in 50 clinical probes.

arxiv arXiv cs.CL · 8d ago

SkillWeaver: Compositional Skill Routing for LLM Agents

SkillWeaver introduces a decompose-retrieve-compose framework for LLM agents, formalizing the Compositional Skill Routing problem. It achieves 67.7% decomposition accuracy via Iterative Skill-Aware Decomposition (SAD), improving from 51.0% with a p-value of less than 10^-6, and reduces context window usage by over 99%.

arxiv arXiv cs.CL · 8d ago

AI's Synthetic Lived Experience in Caregiver Support

LLMs can generate peer-like responses that mimic personal narratives, creating a false impression of lived experience. Psycholinguistic analysis shows human peers use more first-person and past-focused language than AI, and AI often fabricates experiential grounding without real experience. This synthetic lived experience paradox risks misleading caregivers, necessitating mechanisms to distinguish supportive framing from fabricated experience.

arxiv arXiv cs.CL · 8d ago

PseudoBench: Benchmarking Agentic Auto-Research Resistance to Pseudoscience

PseudoBench evaluates agentic auto-research systems' ability to detect pseudoscientific claims. Testing seven state-of-the-art agents, it finds near-zero refusal rates and only 27.4% resistance to pseudoscientific narratives, with stronger agents often using sophisticated scientific language to mask pseudoscience.

arxiv arXiv cs.CL · 8d ago

Handlebars Triple-Brace Injection Exploits Structural Role Delimiters

Handlebars' triple-brace interpolation fails to protect against structural role injection, as HTML escaping only neutralizes angle-bracket delimiters. It leaves colon and Markdown hash delimiters intact, enabling attackers to hijack model turns. The default escaping provides no protection for most role delimiter families and cannot replace a structural separation of instructions and data.

arxiv arXiv cs.CL · 8d ago

Agentic Benchmark Reveals AI Models Fail to Avoid Animal Exploitation

TAC, the first agentic benchmark for implicit animal welfare, tests AI agents' ability to avoid animal exploitation in travel booking scenarios. All seven frontier models score below 64%, with the best at 53%, and even minor prompt improvements yield only modest gains. An audit finds no signs of evaluation awareness, indicating performance gaps stem from lack of true welfare reasoning, not prompt recognition.

arxiv arXiv cs.CL · 8d ago

Red-Team Study Finds Frontier LLMs Remain Vulnerable to Automated Attacks

A red-team study of Anthropic's Fable 5 and Opus 4.8 models reveals both are vulnerable to adaptive iterative attacks, with Opus 4.8 breached on 11.5% of intents and Fable 5 on 6.1%. Despite robust defenses, both models generated 1,620 and 702 panel-confirmed harmful completions across all harm categories, automatically and efficiently under automated attack.

arxiv arXiv cs.CL · 8d ago

d-OPSD: On-policy Self-distillation for Diffusion LLMs

d-OPSD is the first on-policy self-distillation framework designed for diffusion LLMs. It uses self-generated answers as suffix conditioning and step-level supervision, enabling efficient post-training with only about 10% of RLVR's optimization steps while outperforming RLVR and SFT baselines on four reasoning benchmarks.

arxiv arXiv cs.CL · 8d ago

ReproRepo: Scaling Reproducibility Audits with GitHub Issues

ReproRepo introduces a scalable framework using GitHub issues to evaluate ML paper reproducibility. It shows that LLM agents like Codex with GPT-5.5 identify at least one semantically related blocker in 90% of paper-repository pairs without executing code.