Research paper — korshunov.ai

Introducing Quantum Measurement Temperature to Stabilize Hybrid QNN Training

A learnable scaling parameter called Quantum Measurement Temperature (QMT) is introduced to rescale quantum measurement outputs in hybrid quantum neural networks. This approach mitigates measurement-induced logit contraction, enhancing gradient magnitude and stability during training without altering the quantum circuit or measurement operators. Experiments show improved logit separation, gradient strength, and classification accuracy in protein and image classification tasks.
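The rescaling idea can be illustrated with a small sketch. It is not the paper's implementation: it assumes the measurement outputs are expectation values bounded in [-1, 1], that they are used directly as logits for a softmax cross-entropy loss, and that QMT acts as a single scalar multiplier. The function names and the sample values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def measurement_grad(expectations, label, temperature):
    """Gradient of cross-entropy w.r.t. each raw measurement output.

    Assumed model: logits = temperature * expectations, so by the chain
    rule d(loss)/d(expectation_k) = temperature * (p_k - y_k).
    """
    probs = softmax([temperature * e for e in expectations])
    return [temperature * (p - (1.0 if k == label else 0.0))
            for k, p in enumerate(probs)]


def temperature_grad(expectations, label, temperature):
    """d(loss)/d(temperature) = sum_k (p_k - y_k) * expectation_k."""
    probs = softmax([temperature * e for e in expectations])
    return sum((p - (1.0 if k == label else 0.0)) * e
               for k, (p, e) in enumerate(zip(probs, expectations)))


# Hypothetical expectation values; bounded outputs give nearly flat logits.
expectations = [0.30, -0.10, 0.05]
grad_plain = measurement_grad(expectations, label=0, temperature=1.0)
grad_scaled = measurement_grad(expectations, label=0, temperature=5.0)
```

In this toy setting the gradient reaching the measurement outputs is larger at temperature 5 than at temperature 1, and the temperature gradient at 1 is negative, so a gradient step would increase the scale. A very large temperature saturates the softmax and shrinks gradients again, which is one reason to learn the value rather than fix it.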

arxiv arXiv cs.LG · 18h ago

Deep material network for homogenization of piezoelectric composites

A piezoelectric deep material network (PDMN) is proposed to efficiently homogenize two-phase piezoelectric composites. The framework embeds electromechanical homogenization relations into its architecture, enabling physics-informed, semi-analytical predictions with over three orders of magnitude lower computational cost than direct numerical simulation, validated on PVDF-LiNbO3 and viscoelastic-piezoelectric composites under nonlinear loading.

arxiv arXiv cs.LG · 18h ago

Concept-Constrained Prompt Learning for Few-Shot CLIP Adaptation

CCPL introduces a lightweight framework that anchors class prompts to frozen concept prototypes, improving few-shot CLIP adaptation. It achieves better base-to-new performance on DTD and EuroSAT compared to CoOp, with consistent gains from text-space concept regularization, though results vary by dataset and protocol.

arxiv arXiv cs.LG · 18h ago

Stationary Robust Mean-Field Games under Model Mismatches

This paper introduces a stationary mean-field game framework that directly incorporates distributional model uncertainty into population-coupled dynamics. It establishes a robust dynamic programming principle, proves existence of a stationary robust equilibrium, and presents the first algorithm with convergence guarantees. The mean-field solution approximates finite-population equilibria and provides explicit non-asymptotic error bounds under model uncertainty.

arxiv arXiv cs.LG · 18h ago

Training-Free Task Classification for Multi-Task Model Merging

SiM enables dynamic routing in multi-task model merging without additional training or task ID access. It uses SVD-based manifold approximations and projects test inputs onto precomputed task manifolds to route inputs to relevant experts, improving performance and narrowing the gap to the individual expert models.
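A minimal sketch of projection-based routing follows, under strong simplifications. Each task is approximated by a rank-1 subspace (its top right singular vector, obtained here by power iteration instead of a full SVD), and an input is routed to the task with the smallest projection residual. The rank, the feature space, and all names are assumptions for illustration, not details taken from SiM.

```python
import math
import random


def top_direction(rows, iters=100, seed=0):
    """Approximate the top right singular vector of `rows` by power
    iteration on X^T X (a stand-in for a truncated SVD)."""
    rng = random.Random(seed)
    dim = len(rows[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        xv = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(xv[i] * rows[i][j] for i in range(len(rows)))
             for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def residual(x, v):
    """Norm of the part of x that lies outside the span of unit vector v."""
    coeff = sum(a * b for a, b in zip(x, v))
    return math.sqrt(sum((a - coeff * b) ** 2 for a, b in zip(x, v)))


def route(x, subspaces):
    """Return the task name whose subspace reconstructs x best."""
    return min(subspaces, key=lambda name: residual(x, subspaces[name]))


# Hypothetical per-task features, precomputed offline without task labels
# at test time.
task_features = {
    "a": [[2.0, 0.1, 0.0], [3.0, -0.1, 0.1], [-1.0, 0.0, 0.05]],
    "b": [[0.0, 1.0, 1.1], [0.1, 2.0, 1.9], [0.0, -1.0, -1.0]],
}
subspaces = {name: top_direction(rows) for name, rows in task_features.items()}
chosen = route([0.05, 1.5, 1.4], subspaces)
```

Routing needs only a dot product and a norm per task at test time, which is consistent with the training-free claim; a higher-rank basis would replace the single vector with a small orthonormal matrix.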

arxiv arXiv cs.LG · 18h ago

Importance-Weighted On-Policy Distillation Addresses Position Bias

On-Policy Distillation (OPD) suffers from position bias: later tokens provide poor supervision. Importance-Weighted On-Policy Distillation (IW-OPD) assigns token weights based on distribution discrepancy, prioritizing early tokens. IW-OPD converges faster and achieves performance gains of up to 6.9 points on AIME-2025.

arxiv arXiv cs.LG · 18h ago

Scalable Bayesian Models for Stellar Flare Detection

A generative surrogate framework using a Variational Autoencoder approximates Gaussian Process priors, bypassing costly covariance operations. The VAE+Hidden Markov Model architecture enables fast, scalable stellar flare detection in large astronomical time series, matching exact models in structural fidelity while significantly reducing computation time.

arxiv arXiv cs.AI · 19h ago

Geometry-Aware Online Scheduling for LLM Serving

A new scheduling algorithm, Smallest Volume First (SVF), reduces LLM inference latency by optimizing key-value cache management. Theoretical analysis shows a worst-case competitive ratio reduced from 48 to 5, with 1-bit SVF achieving strong performance using minimal information. Evaluations on Llama-3.1 models confirm improvements in both average and tail latency, with the approach integrated into vLLM.

arxiv arXiv cs.AI · 19h ago

Hypothesis-Driven Skill Optimization for LLM Agents

HDSO enables safe, auditable skill updates for LLM agents without training, using falsifiable hypotheses and validation. On ALFWorld, it improves the average success rate (SR) of Qwen3-8B by 6.9 points and maintains a 7.1-point gain under noisy feedback, with validated skills transferable across runs and models when diagnostic alignment is achieved.

arxiv arXiv cs.AI · 19h ago

Flow Annealing Posterior Sampling for Function-Space Regression and Inverse Problems

FAPS is the first function-space posterior sampling framework that unifies stochastic-process regression and PDE inverse problems. It uses pretrained flow-matching priors and Langevin correction with low-rank covariance preconditioning to enable efficient, accurate posterior inference from sparse, noisy data with coherent uncertainty quantification.

arxiv arXiv cs.AI · 20h ago

Select-to-Act: Hierarchical RL with Adaptive Language Guidance

HRLLI introduces a hierarchical reinforcement learning framework that adapts natural-language instructions dynamically during decision-making. It decomposes instructions into stage-specific guidance elements and uses a select-to-act paradigm to enable real-time selection of relevant instruction pieces, improving sample efficiency and performance in complex environments.

arxiv arXiv cs.AI · 20h ago

SAFER: Reliable Test-Time Adaptation under Adversarial Streams

SAFER is a training-free framework that enhances robustness of test-time adaptation by using reliability-guided augmentation. It generates stochastic augmentations, pools predictions via correlation-weighted aggregation with outlier detection, and includes adaptive mixing to preserve clean performance under adversarial attacks. Evaluations on PACS, VLCS, and OfficeHome show improved resilience without sacrificing clean accuracy.
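The aggregation step can be sketched as follows. This is an illustration under assumptions, not SAFER's actual rule: each augmented view is weighted by the Pearson correlation between its class probabilities and the unweighted consensus, and views at or below a fixed correlation threshold are treated as outliers and dropped. The threshold, the function names, and the sample predictions are hypothetical, and the adaptive mixing step is omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def aggregate(view_probs, min_corr=0.0):
    """Correlation-weighted average of per-view class probabilities.

    Views whose correlation with the consensus is <= min_corr get weight 0.
    """
    n_classes = len(view_probs[0])
    consensus = [sum(v[k] for v in view_probs) / len(view_probs)
                 for k in range(n_classes)]
    weights = [pearson(v, consensus) for v in view_probs]
    weights = [w if w > min_corr else 0.0 for w in weights]
    total = sum(weights)
    if total == 0:
        return consensus  # no reliable view; fall back to the plain mean
    return [sum(w * v[k] for w, v in zip(weights, view_probs)) / total
            for k in range(n_classes)]


# Three augmented views agree on class 0; the last one is a hypothetical
# adversarially flipped prediction.
views = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.65, 0.25, 0.10],
    [0.05, 0.05, 0.90],
]
pooled = aggregate(views)
```

Here the flipped view correlates negatively with the consensus and receives zero weight, so the pooled prediction stays on class 0.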

arxiv arXiv cs.AI · 20h ago

Sparsity-Storage-Accuracy Tradeoff in Parsimoniously Activated Dictionary Learning

Parsimoniously activated dictionary learning (PADL) establishes a structured generative model with auxiliary latent variables, enabling maximum a posteriori estimation. This framework provides generalization guarantees and an analytical characterization of the tradeoff between sparsity, storage cost, and reconstruction accuracy, allowing data-driven hyperparameter estimation. The resulting algorithm achieves better reconstruction performance and accelerates inference in vision-language models.

arxiv arXiv cs.AI · 20h ago

First-Token Broadcasters in Transformers: Language Identity and Robustness

LIHA reveals a small set of first-token broadcaster heads in GPT-2 that persistently attend to the initial prompt token, driving language switches. Instruction tuning reorganizes these circuits, concentrating language identity at early layers, as seen in Qwen2.5-1.5B-Instruct and confirmed in Chinese and Russian language handling at layer 0.

arxiv arXiv cs.AI · 20h ago

ARIA: A Causal-Aware Framework for Rescuing LLM Reasoning

ARIA addresses contextual tunneling in LLMs by conditioning knowledge use on mechanistic completeness. It uses a three-tier cascade for causal reasoning, physics-informed transfer, and parametric fallback, and improves materials discovery through auditable, physically grounded reasoning.

arxiv arXiv cs.AI · 20h ago

HyperAdapter: Structured Hyperedge Adaptation for Vision Transformer Fine-Tuning

HyperAdapter introduces a hypergraph-based adapter that performs structured, group-aware adaptation in vision transformers by operating in hyperedge space rather than token space. It uses prototype-based assignments to build a soft hypergraph, aggregates token features into hyperedge representations, applies lightweight adaptation, and diffuses updates back via hypergraph structure, enabling explicit structural inductive bias while maintaining efficiency. Experiments show consistent performance gains over baseline PEFT methods, especially on tasks requiring structured reasoning.

arxiv arXiv cs.AI · 20h ago

MetaPS: Adaptive Strategy Selection for Market Agents

MetaPS is a simulation-guided framework that enables market agents to adaptively select among programmatic strategies based on market states. It uses simulated markets to generate supervised training data, then selects strategies during inference to produce executable actions. Experiments show MetaPS outperforms fixed strategies and LLM-based agents, with compact models exceeding stronger API models in performance.

arxiv arXiv cs.AI · 20h ago

P4IR Framework Improves LLM-Based Code Compliance Accuracy

P4IR, a two-stage framework, uses supervised fine-tuning and Group Relative Policy Optimization to enhance large language model-based automated code compliance systems. It reduces tree edit and token-level Levenshtein distances by up to 23.8% and 38.6%, respectively, outperforming leading LLMs such as Claude Opus, GPT-5.2, and GLM-4.7 evaluated with zero-shot and few-shot prompting, and reduces false positives by a small but statistically significant margin.
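Two ingredients named above can be sketched together: a token-level Levenshtein distance and the group-relative advantage normalization that gives Group Relative Policy Optimization its name. Using the negative edit distance as the reward is an assumption made here for illustration; the summary does not state P4IR's reward. The rule text and the function names are hypothetical.

```python
import statistics


def levenshtein(a, b):
    """Edit distance between two sequences (token lists or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical reference rule and three sampled model outputs for one prompt.
reference = "if door.width >= 0.9 then pass".split()
candidates = [
    "if door.width >= 0.9 then pass".split(),
    "if door.width > 0.9 then pass".split(),
    "door.width >= 0.9".split(),
]
distances = [levenshtein(c, reference) for c in candidates]
rewards = [-d for d in distances]  # assumed reward: closer is better
advantages = group_relative_advantages(rewards)
```

Because advantages are computed relative to the group mean, no separate value network is needed; outputs closer to the reference than their siblings receive positive advantages.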

arxiv arXiv cs.AI · 20h ago

Gold Points Sniper: Self-guided Visual Reasoning for Fine-grained Action Understanding

Gold Points Sniper (GPS) enables lightweight vision-language models to perform self-guided multimodal reasoning for fine-grained human action understanding. By integrating a Gold Points Extractor, Selective Socratic Questioner, and Semantic Entailment Evaluator, GPS achieves performance comparable to GPT-4o while maintaining superior factual accuracy on CAP benchmark-based instruction-tuning data.

lab Hugging Face Blog · 20h ago

NVIDIA NeMo AutoModel Speeds Up Transformer Fine-Tuning

NVIDIA's NeMo AutoModel enables faster fine-tuning of transformer models by automating model selection and optimization. It reduces development time and improves efficiency in training large language models on NVIDIA hardware.