AI agents — korshunov.ai


Self-Evolving Cognitive Framework for Embodied Scientific Intelligence

The paper proposes a self-evolving cognitive framework that uses causal world modeling to enable embodied systems to continuously refine their internal models through interaction. It integrates causal modeling, intervention-driven reasoning, and continual refinement, redefining embodied interaction as an epistemic process for causal discovery and knowledge acquisition. The framework supports a shift from predictive to epistemic intelligence, with a new benchmark for evaluating self-evolving embodied scientific intelligence.

arxiv arXiv cs.AI · 20h ago

VADAOrchestra: Neurosymbolic Orchestration of Adaptive Reasoning Workflows

VADAOrchestra introduces a neurosymbolic framework that combines LLM-based workflow orchestration with Datalog+/- symbolic reasoning. It enables adaptive, explainable decision-making by incrementally planning workflows and executing logical inference on demand, offering verifiable traces, auditability, and scalability over large datasets.

arxiv arXiv cs.AI · 20h ago

SCOPE: Self-Adaptive Symbolic Planning for Open-Ended Environments

SCOPE introduces a framework that refines action plans and evolves symbolic world models in open-ended environments. It combines a Symbolic Execution Simulator and a Self-Adaptive Symbolic Memory to improve plan completeness, perturbation resilience, and cross-task adaptability.

arxiv arXiv cs.AI · 20h ago

LLM-Orchestrated Agent for SOI Directional Coupler Design

A large language model orchestrates the design of a silicon-on-insulator 2x2 directional coupler by proposing gap values and assessing convergence. The design is validated through eigenmode and FDTD simulations on a common 2D effective-index model, showing a consistent phase offset of 2.837(11) micrometers that is corrected in a closed-loop process. The final device achieves a 50/50 split with a cross fraction of 0.498, within 0.0017 of the target.

lab Mistral AI News · 21h ago

New Connector Controls for Enterprise Security and Access

Mistral Studio now offers enriched admin controls to govern connector access per workspace and tool, enabling fine-grained permissions. Features include API keys with scopes, multi-account connectors, and a new Connectors Debugger for root cause analysis, all supporting secure, auditable integration with enterprise systems.

media Hugging Face Forums · 21h ago

Aiden Mobile Agent Prototype in the Making

Aiden is a physical AI agent device that monitors a phone's screen via HDMI and controls it through USB HID, enabling app automation without jailbreak or installed software. It supports bring-your-own LLMs, operates without backend infrastructure or data collection, and is released under the AGPL license as an open-source development board.

arxiv arXiv cs.AI · 21h ago

Grounded Scaling: Determinism as a Core Limit in Agentic AI

Agentic AI performance degrades exponentially in non-deterministic environments, with k-step success falling as δ^k when per-step determinism δ < 1. The paper introduces a framework linking environment determinism to task success, verifiability, and skill evolution, proposing a Supply Certainty Index and a five-level Determinism Maturity Model. It challenges prevailing views by identifying determinism as a binding constraint across compute, data, embodiment, and alignment.
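The δ^k decay can be checked with a few lines of arithmetic. The sketch below assumes every step succeeds independently with the same probability δ, which is a simplification made for illustration; the function names are hypothetical and not taken from the paper.

```python
def k_step_success(delta: float, k: int) -> float:
    """Probability that all k steps succeed, assuming independent steps
    that each succeed with probability delta (illustrative assumption)."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return delta ** k


def max_steps_for_target(delta: float, target: float) -> int:
    """Largest k for which delta**k is still at least `target`."""
    if not (0.0 < delta < 1.0 and 0.0 < target <= 1.0):
        raise ValueError("need 0 < delta < 1 and 0 < target <= 1")
    k, p = 0, 1.0
    # Keep adding steps while the next step would stay at or above the target.
    while p * delta >= target:
        p *= delta
        k += 1
    return k


if __name__ == "__main__":
    for delta in (0.999, 0.99, 0.9):
        print(delta, round(k_step_success(delta, 50), 4),
              max_steps_for_target(delta, 0.5))
```

Under this assumption, a per-step determinism of 0.99 supports at most 68 steps before the overall success rate drops below 50%, which illustrates why long-horizon tasks are so sensitive to small per-step uncertainty.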

arxiv arXiv cs.AI · 22h ago

Gazer: Training-Free Semantic Correction for Autoregressive Visual Models

Gazer introduces a training-free framework that uses multimodal large language model feedback to correct semantic errors in real time during autoregressive visual model generation. By integrating reflective diagnosis and semantic correction stages, Gazer improves compositional accuracy and semantic alignment across multiple models without additional training.

arxiv arXiv cs.AI · 22h ago

MacAgentBench Launches macOS AI Agent Benchmark

MacAgentBench introduces a comprehensive benchmark with 676 tasks across 25 applications, 60% of which involve both GUI and CLI interactions. It uses deterministic rule-based evaluation and fine-grained multi-checkpoint scoring, revealing that Claude Opus 4.6 on OpenClaw achieves 73.7% Pass@1, primarily due to its skill library rather than framework design.

media r/LocalLLaMA · 22h ago

Nex-N2-Mini-Ultra-Uncensored-Heretic Model Released

The Nex-N2-Mini-Ultra-Uncensored-Heretic model is now available, featuring agentic thinking with 5/100 refusals and a KL divergence (KLD) of 0.0020. It is released in both Safetensors and GGUF formats and is accessible via Hugging Face. The creator notes that Heretic 1.2.0 was chosen over 1.4.0 because it did better at avoiding high KLD while maintaining low refusal thresholds.

arxiv arXiv cs.AI · 1d ago

PaperClaw: Autonomous Research with Human-in-the-Loop Refinement

PaperClaw is a multi-agent system that autonomously conducts research from field selection to paper publication. It uses a validated, iterative propose-test-reflect loop, grounded in real references and runnable results, and supports human-in-the-loop refinement at any stage. Evaluation shows it produces strong papers both autonomously and with human oversight.

arxiv arXiv cs.LG · 1d ago

DataClaw0: Agentic Tailoring of Multimodal Data from Raw Streams

DataClaw0 introduces an agentic paradigm for actively refining raw multimodal data to align with user and downstream intents. It uses a two-stage pipeline grounded in factual anchors to generate a large-scale dataset across five domains and combines supervised fine-tuning with GRPO to achieve strong alignment with complex refinement tasks. Evaluated on video generation, VQA, and GUI navigation, DataClaw0 produces high-information-density tailored data, enabling efficient model adaptation with minimal training data.

arxiv arXiv cs.LG · 1d ago

Neural Action Codec for Vision-Language-Action Models

NAC, an architecture inspired by neural audio codecs, compresses robot action trajectories as multi-channel 1D signals using multi-scale residual vector quantization. By replacing mel-spectrogram losses with time-domain and non-mel spectral reconstruction losses, NAC achieves high-fidelity action encoding with minimal architectural changes, outperforming existing tokenizers in reconstruction error and success rates on real-world manipulation tasks.
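For readers unfamiliar with residual vector quantization, here is a minimal sketch of the basic idea: each stage quantizes whatever error the previous stages left behind. It shows plain single-scale RVQ rather than the multi-scale variant the summary mentions, and the tiny hand-written codebooks are hypothetical stand-ins for codebooks that a real codec would learn from data.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest(codebook: Codebook, x: Vector) -> int:
    """Index of the codebook entry closest to x in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def rvq_encode(x: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize x stage by stage; each stage encodes the remaining residual."""
    residual = list(x)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


if __name__ == "__main__":
    # Hypothetical codebooks: a coarse stage followed by a finer stage.
    coarse = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0]]
    fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]]
    codes = rvq_encode([1.2, 0.9], [coarse, fine])
    print(codes, rvq_decode(codes, [coarse, fine]))
```

Adding stages refines the reconstruction at the cost of one extra token per stage, which is the trade-off a tokenizer built this way has to balance.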

arxiv arXiv cs.LG · 1d ago

VLA-FAIL: Lightweight Failure Detection for Vision-Language-Action Models

VLA-FAIL introduces a lightweight failure detection framework for vision-language-action models that uses last-layer Mahalanobis distance and action chunk consistency without requiring failure data or expensive action sampling. The framework combines these detectors to achieve reliable, early failure detection across diverse tasks, outperforming baseline methods in both accuracy and efficiency.
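The Mahalanobis part of such a detector can be sketched as follows: fit a mean and covariance on features from successful runs only, then flag inputs whose distance from that distribution is large. This is a generic illustration under stated assumptions, not the paper's implementation; the 2-D toy features, the ridge term, and the threshold are all hypothetical.

```python
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows, mu, ridge: float = 1e-6) -> List[List[float]]:
    """Sample covariance with a small ridge on the diagonal for stability."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [v - m for v, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    for i in range(d):
        cov[i][i] += ridge
    return cov


def solve(a, b) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis(x, mu, cov) -> float:
    """sqrt((x - mu)^T cov^{-1} (x - mu)), computed via a linear solve."""
    diff = [v - m for v, m in zip(x, mu)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff))) ** 0.5


if __name__ == "__main__":
    # Hypothetical features collected from successful rollouts only.
    ok = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
    mu = mean_vector(ok)
    cov = covariance(ok, mu)
    threshold = 3.0  # illustrative cut-off, would be calibrated in practice
    for feat in ([1.0, 1.0], [10.0, 10.0]):
        print(feat, mahalanobis(feat, mu, cov) > threshold)
```

Because the statistics come from successful runs alone, no failure examples are needed at fit time, which matches the property the summary highlights.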

arxiv arXiv cs.LG · 1d ago

LDT-FRL Framework for Cyber-Resilient IoMT

The LDT-FRL framework introduces a privacy-preserving defense system for Internet of Medical Things (IoMT) devices, combining temporal attention, lightweight digital twins, and federated reinforcement learning. It achieves 99.66% and 99.95% accuracy on the CICDDoS 2019 and TON-IoT benchmarks, respectively, with perfect F1 on the MITM class, converging 81% faster than prior methods and offering interpretable defense decisions via SHAP and Grad-CAM.

arxiv arXiv cs.LG · 1d ago

ASCII Art Enables Text-Only LLMs to Control VLA Systems

A text-only large language model can be adapted into a Vision-Language-Action controller by using ASCII-rendered visual observations. This approach allows LLMs to interpret visual states through text, enabling them to follow natural-language instructions and generate executable actions in both simulation and on physical manipulators.
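To make the idea of an ASCII-rendered observation concrete, the sketch below maps a grayscale grid to characters by brightness and wraps the result in a text prompt. The character ramp, the prompt layout, and the function names are assumptions chosen for illustration; the summary does not specify how the paper renders its observations.

```python
from typing import Sequence

# Characters ordered from darkest to brightest (a common, arbitrary choice).
RAMP = " .:-=+*#%@"


def to_ascii(image: Sequence[Sequence[int]], ramp: str = RAMP) -> str:
    """Render a grayscale image (values 0..255) as lines of ASCII characters."""
    lines = []
    for row in image:
        chars = []
        for value in row:
            value = min(255, max(0, int(value)))  # clamp to the valid range
            chars.append(ramp[value * (len(ramp) - 1) // 255])
        lines.append("".join(chars))
    return "\n".join(lines)


def build_prompt(image: Sequence[Sequence[int]], instruction: str) -> str:
    """Combine the rendered observation and an instruction into one text prompt."""
    return (
        "Observation:\n"
        f"{to_ascii(image)}\n"
        f"Instruction: {instruction}\n"
        "Action:"
    )


if __name__ == "__main__":
    frame = [
        [0, 0, 0, 0],
        [0, 255, 255, 0],
        [0, 255, 255, 0],
        [0, 0, 0, 128],
    ]
    print(build_prompt(frame, "move the gripper to the bright block"))
```

A text-only model would then be asked to continue the prompt with an action string, which a separate layer would parse into motor commands.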

arxiv arXiv cs.LG · 1d ago

Decoupling Declarative and Procedural Knowledge in Vision-Language-Action Models

w²VLA introduces a modular approach that decouples declarative and procedural knowledge in Vision-Language-Action models. By restructuring information flow, it enables robust behavior cloning and unprecedented zero-shot skill transfer across unseen, dissimilar objects.

media Hugging Face Forums · 1d ago

I Built an MCP Server in Go for AI Agents - 200 Lines Tutorial

A 200-line Go tutorial demonstrates how to build a lightweight Model Context Protocol (MCP) server, relying on Go's concurrency and simplicity. The server lets AI agents such as Claude access structured data and Go applications, which the author suggests could make them 10x more useful.

media r/LocalLLaMA · 1d ago

Qwen Releases Qwen-AgentWorld-397B-A17B Model

Qwen has announced a new large language model called Qwen-AgentWorld-397B-A17B. The model is mentioned on Hugging Face and Qwen's official blog, indicating its public release and availability for use.

media r/LocalLLaMA · 1d ago

GitHub Repository: Qwen-AgentWorld for Language World Models

Qwen-AgentWorld is a GitHub repository introducing language world models designed for general-purpose agents. The project aims to enable agents with broader, more realistic world understanding through language-based modeling.