All articles — korshunov.ai

RECALL: Recovery Experience Collection for Active Lifelong Learning in Vision-Language-Action Models

This paper proposes an active, continual learning paradigm for Vision-Language-Action (VLA) models to address the inefficiencies of passive imitation learning. The authors demonstrate that uncertainty-guided data collection improves fine-tuning efficiency but causes catastrophic forgetting when recovery data is used exclusively.

arXiv cs.LG · 3d ago

DiT-Reward: Generative Representations for Text-to-Image Reward Modeling

The article introduces DiT-Reward, a method that converts a pretrained text-to-image Diffusion Transformer into a reward model by processing near-clean image latents and aggregating text-conditioned representations across transformer layers. This approach leverages generative representations to evaluate the quality of generated images without requiring separate training objectives.

arXiv cs.LG · 3d ago

Muown Implicitly Performs Angular Step-size Decay

The article demonstrates that Muown's directional update is equivalent to a Riemannian step on normalized directions, where the un-normalized parameterization magnitude modulates the angular step size. This insight explains Muown's step-size stability and motivates the development of AngularMuown, which optimizes directly over normalized directions with an explicit, schedulable angular multiplier.

arXiv cs.LG · 3d ago
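
The geometric effect described in the Muown summary above can be illustrated with a small sketch. This is not Muown's update rule: it assumes, purely for illustration, that every update is orthogonal to the current weight vector and has a fixed length `step`. The function name `angular_steps` and its parameters are hypothetical. Under that assumption the angle turned per step is atan(step / norm), and the norm grows by Pythagoras, so the angular step shrinks over time without any explicit schedule.

```python
import math


def angular_steps(norm0, step, num_steps):
    """Return the angle (radians) turned at each step.

    Assumption (not from the paper): each update is orthogonal to the
    current weight vector and has fixed Euclidean length `step`.
    """
    norm = norm0
    angles = []
    for _ in range(num_steps):
        # An orthogonal step of length `step` rotates the direction by
        # atan(step / norm).
        angles.append(math.atan2(step, norm))
        # The new norm follows from Pythagoras, so it never decreases.
        norm = math.hypot(norm, step)
    return angles


angles = angular_steps(norm0=1.0, step=0.1, num_steps=5)
for i, angle in enumerate(angles):
    print(f"step {i}: angle = {angle:.5f} rad")
```

Because the un-normalized magnitude only grows in this toy setting, each printed angle is smaller than the one before it, which is the kind of implicit angular decay the summary refers to.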

Learning Process Rewards via Success Visitation Matching for Efficient RL

The authors propose a method to transform inherently sparse outcome rewards in reinforcement learning into dense process rewards by training a discriminator to distinguish between successful and unsuccessful episodes. This approach incentivizes the policy to match the state-action visitations of successful episodes while avoiding those of unsuccessful ones, providing dense feedback on progress without altering the optimal policy.

arXiv cs.LG · 3d ago

Diffusion Models Adapt to Low-Dimensional Structure Under Flexible Coefficient Choices

This paper demonstrates that diffusion models' ability to exploit low-dimensional structure for accelerated sampling is a robust property independent of specific update coefficient choices. The authors prove that a broad class of coefficients allows generating an ε-accurate sample in O(k/ε) iterations, regardless of ambient dimension.

arXiv cs.LG · 3d ago

Dynamic estimation of slowly varying sequences

This article introduces a framework for sequentially approximating functions in slowly-varying sequences, leveraging the reuse of past queries to reduce overall computational cost. The authors present novel sequential estimation results for matrix powers, spectral densities, Monte Carlo integration, and partial differential equation boundary value problems.

arXiv cs.LG · 3d ago

Action-BED: Task-Driven Bayesian Experimental Design with Singly Intractable Objectives

The article introduces Action-BED, a new framework for Bayesian experimental design that formulates the problem in terms of expected future loss on downstream actions rather than uncertainty reduction. This approach converts traditionally doubly intractable objectives into singly intractable ones that can be jointly optimized using stochastic gradients.

arXiv cs.LG · 3d ago

MAS-PromptBench: When Does Prompt Optimization Improve Multi-Agent LLM Systems?

This study systematically investigates the impact of system-prompt optimization on multi-agent systems (MAS) by benchmarking two optimizers across diverse configurations of tasks, workflows, and team sizes.

arXiv cs.LG · 3d ago

On the Limits of Prompt-Conditioned Language Models as General-Purpose Learners

This paper argues that Large Language Models are not universal problem solvers through prompting alone, due to fundamental constraints in language as a communication interface and alignment requirements. The authors analyze user-system interaction as a cheap-talk game to derive PAC-Bayes bounds distinguishing estimation error from structural limitations.

arXiv cs.LG · 3d ago

Tapered Language Models: Improving Performance via Depth-Aware Capacity Allocation

The article introduces Tapered Language Models (TLMs), an architectural principle that allocates more parameter capacity to earlier layers and less to later layers within a fixed budget. This approach challenges the standard practice of uniform layer width by leveraging evidence that later layers primarily refine the residual stream rather than transforming it.

arXiv cs.LG · 3d ago
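
As a rough illustration of the depth-aware allocation described in the Tapered Language Models summary above, the sketch below splits a fixed width budget across layers so that earlier layers are wider than later ones. The linear schedule, the function name `tapered_widths`, and the `taper_ratio` parameter are all assumptions made for this example; the summary does not specify how the paper tapers capacity. The budget is counted in width units, which is a simplification, since parameter counts typically scale faster than linearly with width.

```python
def tapered_widths(num_layers, total_width, taper_ratio):
    """Split `total_width` across layers with a linear taper.

    Hypothetical schedule: the first layer gets roughly `taper_ratio`
    times the width of the last layer, with linear interpolation between.
    """
    if num_layers == 1:
        return [total_width]
    # Relative weights fall linearly from taper_ratio (first) to 1 (last).
    raw = [
        taper_ratio - (taper_ratio - 1.0) * i / (num_layers - 1)
        for i in range(num_layers)
    ]
    scale = total_width / sum(raw)
    widths = [int(r * scale) for r in raw]
    # Rounding down can leave a few units unassigned; give them to the
    # earliest layers so the budget is met exactly.
    leftover = total_width - sum(widths)
    for i in range(leftover):
        widths[i % num_layers] += 1
    return widths


widths = tapered_widths(num_layers=4, total_width=4096, taper_ratio=3.0)
print(widths, "sum =", sum(widths))
```

A uniform model with the same budget would use 1024 units per layer; the tapered allocation keeps the same total while shifting capacity toward the early layers.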

PsyBridge: A Hybrid Intelligent Framework for Multi-Dimensional Mental Health Assessment

This study introduces PsyBridge, a hybrid intelligent framework designed to address the limitations of isolated mental health screening tools by integrating clinically validated assessments with cognitive and personality profiling. The system utilizes a modular architecture and weighted aggregation mechanism to generate interpretable risk classifications and decision support recommendations.

arXiv cs.LG · 3d ago

Open Problem: Is AdamW Effective Under Heavy-Tailed Noise?

This article addresses the lack of rigorous convergence theory for the AdamW optimizer in regimes with heavy-tailed stochastic gradient noise, which is common in large language model pretraining. It questions whether AdamW can converge under these conditions or if its second-moment accumulator creates a genuine obstruction.

arXiv cs.LG · 3d ago

Semantic Browsing: Controllable Diversity for Image Generation

This article introduces Semantic Browsing, a method for generating controlled diversity in text-to-image models by enforcing structure on generated samples to overcome the lack of meaningful variation in current systems. The approach induces diversity directly at the text level rather than relying on stochastic variations within the model.

arXiv cs.LG · 6d ago

CoorDex: Coordinating Body and Hand Priors for Continuous Dexterous Humanoid Loco-Manipulation

The authors introduce CoorDex, a learning pipeline that enables high-degree-of-freedom dexterous loco-manipulation on moving humanoids by converting body and hand control into coordinated latent residual control. This approach allows the Unitree G1 humanoid to perform complex tasks like non-stop bottle grasping and fridge door opening while in motion.

arXiv cs.LG · 6d ago

AutoDex: An Automated Real-World System for Dexterous Grasping Data Collection

AutoDex is an automated system designed to close the loop of real-world dexterous grasping data collection by handling perception, execution, labeling, and reset without human intervention. It addresses the scalability issues of teleoperation and the lack of physical certification in simulation by generating candidate grasps and verifying them on real hardware.

arXiv cs.AI · 6d ago

Adaptive Hard-Soft Physics-Informed Neural Networks for Robust Boundary-Constrained PDE Solving

This study proposes a unified hard-soft physics-informed neural network (HSPINN) with adaptive loss weighting to address the slow convergence and inaccurate boundary enforcement of conventional PINNs. The framework enforces Dirichlet and periodic boundary conditions exactly through analytical lifting or masking, while treating PDE residuals and initial conditions as soft constraints balanced by an inverse-share softmax strategy.

arXiv cs.AI · 6d ago
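
The idea of enforcing Dirichlet conditions exactly through lifting and masking, mentioned in the HSPINN summary above, can be sketched in one dimension. The construction below is a standard textbook form and not necessarily the one used in the paper: on the interval [0, 1], a linear interpolant carries the boundary values and a mask that vanishes at both endpoints multiplies the network output. The names `lifted_solution` and `toy_net` are hypothetical, and `toy_net` is only a stand-in for a trained network.

```python
import math


def lifted_solution(net, a, b):
    """Wrap `net` so the result satisfies u(0) = a and u(1) = b exactly."""
    def u(x):
        # Lifting term: linear interpolant that matches both boundary values.
        boundary_interp = a * (1.0 - x) + b * x
        # Mask: zero at x = 0 and x = 1, so the network cannot change
        # the boundary values no matter what it outputs.
        mask = x * (1.0 - x)
        return boundary_interp + mask * net(x)
    return u


def toy_net(x):
    # Stand-in for a neural network; any function of x works here.
    return math.sin(3.0 * x) + 2.0


u = lifted_solution(toy_net, a=1.0, b=-2.0)
print("u(0) =", u(0.0))
print("u(1) =", u(1.0))
print("u(0.5) =", u(0.5))
```

With the boundary handled by construction, a training loss in this style only needs soft terms for what remains, such as the PDE residual and the initial condition.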

Rethinking Molecular Graph Backdoors under Chemistry-aware Admission

The article introduces ChemGuard, an operational protocol that formalizes the overlooked admission stage of molecular learning pipelines by requiring sanitizable strings and consistent graph reconstruction. This framework reveals that many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent.

arXiv cs.AI · 6d ago

Measuring & Mitigating Over-Alignment for LLMs in Multilingual Criminal Law Courts

This article addresses the challenge of over-alignment in large language models used within Swiss Federal Supreme Court criminal law contexts, where model guardrails frequently trigger refusals when processing sensitive case details. The authors introduce TF-RefusalBench, a multilingual benchmark derived from public rulings, to measure this phenomenon across French, German, Italian, and English.

arXiv cs.AI · 6d ago

Energy-Based Transformers as Predictors of Reading Difficulty

This study introduces energy-based transformers as a novel measure for predicting human reading difficulty, establishing a formal link between transformer models and the associative-memory literature, including Hopfield networks.

arXiv cs.AI · 6d ago

Distribution-Aware Diffusion-LLM for Robust Ultra-Long-Term Time Series Forecasting

The authors propose Diffusion-LLM, a framework that integrates a conditional diffusion model into an LLM-based pipeline to address challenges in multimodal time series forecasting. This joint design enables the learning of future data distributions while improving semantic alignment within a shared latent space.