All articles — korshunov.ai

Autonomous Video Generation with Counterfactual Controllability for Self-Evolving World Models

The article argues that current video generation models learn only partial, implicit spatiotemporal world models rather than fully grounded or controllable ones. It asserts that predictive realism alone is insufficient for creating physical agents because these models often fail to identify controllable variables and embodiment constraints.

arxiv arXiv cs.LG · 4h ago

BehaviorBench: Benchmarking Foundation Models for Behavioral Science Tasks

The authors introduce BehaviorBench, a comprehensive benchmark designed to evaluate foundation models across diverse behavioral science tasks and populations. The study assesses four core capabilities—behavior prediction, strategic decision-making, subject-trait inference, and behavioral knowledge application—at both individual and distributional levels.

arxiv arXiv cs.LG · 4h ago

A Pāṇinian Foundation for Indic Language Processing

The article argues that natural language processing infrastructure for the billion-plus speakers of Indic languages is fragmented due to a lack of shared structural foundations. It proposes leveraging the morphosyntactic architecture formalized in Pāṇini's Aṣṭādhyāyī as a unifying computational framework to improve accuracy and data efficiency.

arxiv arXiv cs.LG · 4h ago

Lightweight Transformer Models for On-Device Fault Detection: A Benchmark Study on Resource-Constrained Deployment

This study benchmarks traditional machine learning methods against lightweight transformer architectures for binary fault detection across three public datasets, evaluating tradeoffs between accuracy, model size, and latency. The research assesses classification performance using F1-score and AUC, while also testing INT8 dynamic quantization and a two-stage adaptive inference pipeline to optimize deployment on resource-constrained hardware.

arxiv arXiv cs.LG · 4h ago

Project Ariadne: Prompt-Conditioned Route Generation for Synthesis Planning

Researchers introduce Ariadne, a decoder-only model that reframes retrosynthetic planning as prompt-conditioned sequence generation, allowing target molecules, constraints, and routes to be represented in a single sequence. This approach eliminates the need for separate models tailored to specific planning specifications.

arxiv arXiv cs.LG · 4h ago

Automated Residual Plot Assessment With the R Package autovi and the Shiny Application autovi.web

The article introduces an R package and a Shiny application designed to automate the visual assessment of residual plots for linear models, addressing the scalability and consistency issues inherent in manual evaluation.

media r/LocalLLaMA · 4h ago

shoutout to /u/TheDankestSlav for this gem

This Reddit post from r/LocalLLaMA is a simple shoutout to user /u/TheDankestSlav. It links to an image shared by the user, which is described as a "gem".

media r/LocalLLaMA · 4h ago

Reddit user criticizes Dario Amodei's claims about open source AI

A Reddit user argues that Anthropic CEO Dario Amodei fundamentally misunderstands how open-source AI models work, specifically disputing his congressional testimony of June 28, 2026. The author contends that Amodei's assertions about transparency and accessibility do not match the current state of open-weight models.

lab Claude Code Releases · 4h ago

Claude Code v2.1.196 Release Notes

Claude Code version 2.1.196 introduces organization default models, clickable file attachments, and improved security for MCP server approvals. The update also enhances background session reliability, fixes various agent status reporting issues, and optimizes token usage in code review workflows.

arxiv arXiv cs.LG · 5h ago

MotifGen: Spatiotemporal interpolation of misaligned satellite images via multi-source generative modeling

Researchers introduce MotifGen, a generative model designed for the spatiotemporal interpolation of tropical cyclone microwave images from multiple geospatial sources with irregular time intervals and geographic misalignment. The model addresses the challenge of high heterogeneity in microwave data by combining inputs from various instruments to fill gaps caused by long satellite revisit times.

arxiv arXiv cs.LG · 5h ago

Deep numerical schemes for systems of Ergodic BSDEs with applications to regime-switching forward utilities

This paper introduces two neural-network-based numerical schemes for solving systems of coupled ergodic Backward Stochastic Differential Equations (eBSDEs), motivated by approximating optimal strategies in regime-switching stochastic factor models.

arxiv arXiv cs.LG · 5h ago

PROTECT-90: A Fault Dataset for Power System Protection

This paper introduces the PROTECT-90 dataset, an open electromagnetic transient (EMT)-simulated reference benchmark designed to address the lack of standardized, publicly available high-voltage waveform datasets for power system protection. The release aims to enable transparent and reproducible evaluation of data-driven methods through consistent digital-fault-recorder-like measurements.

arxiv arXiv cs.LG · 5h ago

Managing Task Execution for Unknown Workloads in Batteryless IoT: A Hardware-Agnostic Evaluation

This study proposes two hardware-agnostic dynamic scheduling strategies for managing volatile energy in batteryless IoT systems without prior task profiles: a model-free Reinforcement Learning agent and an on-the-fly Approximated Prediction method. Evaluated against adaptive and static baselines in a custom simulation framework, the two strategies show distinct operational trade-offs under different system constraints.

arxiv arXiv cs.LG · 5h ago

Open-Vocabulary BEV Segmentation with 3D-Aware Geometric Constraints

The authors introduce OVBEVSeg, a framework for open-vocabulary bird's-eye view (BEV) segmentation that utilizes vision-language models to recognize categories beyond the training set while maintaining real-time efficiency. To address the 3D geometric inconsistency inherent in lifting 2D semantics into BEV, the method employs robust 3D geometric constraints across three progressive stages.

arxiv arXiv cs.LG · 5h ago

PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

The authors introduce PHANTOM, a large-scale open-source dataset containing 47,524 pre-generated adversarial attacks designed to evaluate the safety and robustness of vision-language models (VLMs). This resource consolidates existing benchmarks and extends them with new categories to provide diverse and practical evaluation data for the research community.

arxiv arXiv cs.LG · 5h ago

Parallel Manifold Steering: Efficient Adaptation of Large Associative Memories via Residual Energy Shaping

The authors propose H-Res (Hierarchical Residual Steering), a mechanism that adapts large Transformer models by modulating their effective energy landscape without altering global equilibrium or expanding sequence length. This approach formulates adaptation as a control problem on the activation manifold to steer token trajectories into task-specific basins of attraction.

arxiv arXiv cs.LG · 5h ago

RE4: Transformation-aware Imitation of Object Interactions Using Manipulation Modes

This paper introduces RE4, a framework for imitation learning that combines principled manipulation theories with modern benchmarks to preserve both performance and interpretability in object interaction tasks. The approach utilizes lightweight, self-supervised pose estimation and mode-aware transformations to retrieve and replan demonstrations effectively.

media r/LocalLLaMA · 5h ago

Introducing LongCat-2.0, a large-scale MoE language model

LongCat-2.0 is introduced as a large-scale Mixture of Experts (MoE) language model featuring 1.6 trillion total parameters with approximately 48 billion activated per token.

arxiv arXiv cs.LG · 6h ago

Natural Identifiers for Privacy and Data Audits in Large Language Models

This work introduces natural identifiers (NIDs), structured random strings such as cryptographic hashes and shortened URLs that occur in LLM training data, as a tool for auditing large language model privacy. NIDs enable scalable, post-hoc differential privacy auditing without costly retraining and facilitate dataset inference without requiring private held-out datasets.

arxiv arXiv cs.LG · 6h ago

Data Augmentation: A Fourier Analysis Perspective

Using a framework built on Fourier analysis and the representation theory of finite groups, this article investigates whether partial data augmentation can achieve the same statistical benefits as full augmentation.