All articles — korshunov.ai

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

The authors propose OPID, a framework that extracts skill supervision directly from completed on-policy trajectories to address the sparse reward problem in outcome-based reinforcement learning. By representing trajectory hindsight as hierarchical skills, OPID provides dense, distribution-matched token-level supervision without relying on external memory.

arXiv cs.CL · 5h ago

Computational Study of Lexical Transmission Across Bengali Devotional Traditions

A computational corpus study analyzes vocabulary relationships across eight layers of Bengali and Sanskrit devotional literature from the 8th to 19th centuries, quantifying the historical claim that Buddhist Vajrayana vocabulary was absorbed into the Shakta Tantra tradition. Using TF-IDF character n-gram vectorization on 75 texts, the research provides the first quantitative corroboration of this lexical transmission chain.

arXiv cs.CL · 5h ago
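
The summary above names TF-IDF character n-gram vectorization as the comparison method. A minimal sketch of that general technique follows, assuming trigrams, a smoothed IDF, and cosine similarity; the function names and the toy strings are illustrative placeholders, not the study's corpus, settings, or code.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(texts, n=3):
    """Build one sparse TF-IDF vector (a dict) per text over character n-grams."""
    counts = [Counter(char_ngrams(t, n)) for t in texts]
    # Document frequency: in how many texts each n-gram occurs at least once.
    doc_freq = Counter(g for c in counts for g in c)
    n_docs = len(texts)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed variant).
            g: (k / total) * (math.log((1 + n_docs) / (1 + doc_freq[g])) + 1.0)
            for g, k in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy placeholder strings, not real corpus data.
layers = ["vajra padma mandala", "vajra padma shakti", "river boat market"]
vecs = tfidf_vectors(layers)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Character n-grams compare surface forms directly, so the sketch needs no tokenizer or stemmer, which is one plausible reason to prefer them for historical texts with variable spelling.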

KARLA: Knowledge-base Augmented Retrieval for Language Models

The authors propose KARLA, a method enabling large language models to automatically retrieve factual knowledge from an external knowledge base during token generation. This approach allows factual updates without retraining the model and ensures that outputs are traceable to the source data.

arXiv cs.CL · 5h ago

FBK's Long-form SpeechLLMs for IWSLT 2026 Instruction Following

This paper details FBK's submission to the IWSLT 2026 Instruction Following shared task, presenting SpeechLLMs designed for both short-form and long-form speech instruction following under constrained settings.

arXiv cs.CL · 5h ago

AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems

AgentX is a production-deployed multi-agent system designed to automate the iteration of industrial recommender systems, addressing the bottleneck where innovation currently scales linearly with human headcount.

arXiv cs.CL · 5h ago

Cascaded Multi-Granularity Pruning for On-Device LLM Inference in Industrial IoT

This article introduces a cascaded multi-granularity pruning framework designed to deploy large language models on Industrial Internet of Things (IIoT) edge devices by removing layers, attention heads, and feed-forward channels in a coarse-to-fine order. The method utilizes lightweight low-rank recovery between stages to re-estimate component importance, addressing the collapse of existing structured pruning methods at high compression ratios.

arXiv cs.CL · 5h ago

InfoKV: Information-Aware KV Cache Compression for Long Reasoning

Researchers introduce InfoKV, an entropy-aware framework that compresses key-value caches by combining token-level predictive uncertainty with attention scores to improve long-context reasoning.

arXiv cs.CL · 5h ago

Heterogeneous Neural Predictivity from Language Models During Naturalistic Comprehension

This study demonstrates that frozen language models can serve as effective neural predictors for brain activity during natural speech and text comprehension, while distinguishing predictive utility from claims about shared neural organization. The analysis of MEG and ECoG data revealed widespread positive prediction gains over low-level baselines, though participant-level advantages were localized rather than uniform.

arXiv cs.CL · 5h ago

SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages

This study audits the reliability of eight state-of-the-art Automatic Speech Recognition models on real-world psychiatric interview data in Kannada, Hindi, and Indian English. The results reveal substantial variability across models and languages, with some systems performing competitively in Indian English but failing in regional speech.

arXiv cs.CL · 5h ago

GAVEL: Grounded Caption Error Verification and Localization

Vision-language models frequently generate hallucinated outputs where text and images are misaligned, necessitating methods that not only detect these errors but also explain them and localize visual evidence. The authors introduce GAVEL, a task designed to jointly address verification, explanation, and localization for image-text pairs, accompanied by a corresponding dataset and benchmark.

arXiv cs.CL · 5h ago

Jailbreaking for the Average Jane: Choosing Optimal Jailbreaks via Bandit Algorithms

This study investigates whether non-expert malicious actors can successfully jailbreak large language models by using bandit algorithms to select optimal attacks and enhance queries. The authors propose a novel attack strategy based on the multi-armed bandit framework to efficiently learn the best jailbreak from a large choice set through noisy exploration.

arXiv cs.CL · 6h ago

Term-Centric Hierarchy Induction from Heterogeneous Corpora

Researchers propose a term-centric framework for inducing hierarchical taxonomies from diverse text sources, addressing the limitations of existing methods that rely on document-level representations. This approach maps documents into a shared representation space via automatic term extraction to enable robust cross-source alignment and construct interpretable hierarchies.

arXiv cs.CL · 6h ago

RedVox: Safety and Fairness Gaps in Speech Models Across Languages

A new study reveals significant safety and fairness gaps in multilingual speech models, finding that only 8% of state-of-the-art releases document any multilingual analysis. To address this, the authors introduce RedVox, a benchmark built on real voices covering unsafe requests across five languages.

arXiv cs.CL · 6h ago

Einstein World Models: Visualizing Counterfactuals for LLM Reasoning

The article introduces Einstein World Models (EWMs), a framework designed to enhance large language model reasoning by integrating visual-temporal rollouts into the reasoning trace. This approach allows models to utilize visual thought experiments as inspectable hypotheses to complement text-based processing.

arXiv cs.CL · 6h ago

Auditing Framing-Sensitive Behavioral Instability in LLMs for Mental Health

This study investigates how semantically similar concerns, presented through different contextual framings, elicit varying responses from instruction-tuned large language models, which may undermine system reliability. Using controlled matched prompts and layer-wise probing analyses, the authors demonstrate that framing systematically alters interpretive response tendencies across multiple model architectures.

arXiv cs.CL · 6h ago

ReaORE: Reasoning-Guided Progressive Open Relation Extraction Empowered by Large Reasoning Models

Researchers propose ReaORE, a framework for open relation extraction that utilizes large reasoning models to achieve reliable generalization to unseen relation types. The method addresses limitations of current clustering and direct generation approaches through a coarse-to-fine reasoning process.

arXiv cs.CL · 6h ago

Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs

This study investigates the presence and structure of emotion vectors in open-weight large language models, specifically Apertus-8B-Instruct-2509 and Gemma-4-E4B-it. The research confirms that these models encode valence geometry with high correlation to human psychological structures, approaching the levels previously observed in Claude Sonnet 4.5.

arXiv cs.CL · 6h ago

MinGram: A Minimalist Unigram Tokenizer with High Compression and Competitive Morphological Alignment

The authors introduce MinGram, a minimalist unigram tokenizer that simplifies training by using a BPE-derived seed vocabulary, Hard EM on a minimum-token path, and a single flat score-pruning step. This approach removes the need for suffix arrays, forward-backward passes, and iterative prune loops, making the procedure significantly less complex than standard methods.

arXiv cs.CL · 6h ago
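
The MinGram summary above mentions segmenting along a minimum-token path. One way to read that step is a dynamic program that splits a word into the fewest vocabulary pieces. The sketch below shows only that idea: the function name, the toy vocabulary, and the tie-breaking rule (the first shortest split found wins) are assumptions, and the seed-vocabulary, Hard EM, and pruning stages are not covered.

```python
def min_token_path(word, vocab):
    """Segment `word` into the fewest vocabulary pieces via dynamic programming.

    Returns the list of pieces, or None if the word cannot be covered by `vocab`.
    """
    n = len(word)
    inf = float("inf")
    best = [0] + [inf] * n  # best[i]: fewest tokens that cover word[:i]
    back = [0] * (n + 1)    # back[i]: start index of the last token in that cover
    max_len = max((len(t) for t in vocab), default=0)
    for end in range(1, n + 1):
        # Only consider candidate pieces no longer than the longest vocab entry.
        for start in range(max(0, end - max_len), end):
            if best[start] + 1 < best[end] and word[start:end] in vocab:
                best[end] = best[start] + 1
                back[end] = start
    if best[n] == inf:
        return None
    # Walk the back-pointers from the end of the word to recover the pieces.
    tokens = []
    i = n
    while i > 0:
        tokens.append(word[back[i]:i])
        i = back[i]
    return tokens[::-1]


# Hypothetical toy vocabulary: single letters plus a few longer pieces.
toy_vocab = set("unbreakl") | {"un", "break", "able", "breakable"}
print(min_token_path("unbreakable", toy_vocab))
```

In a hard-EM loop, the pieces returned for each training word would be counted to re-estimate token scores; that loop is omitted here because the summary does not specify its details.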

Improving Verbalized Uncertainty Calibration in Medical VQA

This work addresses the tendency of multimodal large language models to produce overconfident outputs in Medical Visual Question Answering by proposing a training-based framework that finetunes these models for better calibration. The method employs a composite loss function combining Brier-style calibration, anchor regularization, contrastive image-text alignment, and KL divergence terms to align model confidence with actual correctness.

arXiv cs.CL · 6h ago
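
The calibration summary above refers to a Brier-style term. As a point of reference, the standard Brier score is the mean squared gap between a stated confidence and the 0/1 correctness of the answer. The sketch below computes only that standard score on made-up numbers; the paper's actual composite loss and its weighting are not reproduced.

```python
def brier_score(confidences, correct):
    """Mean squared difference between confidence in [0, 1] and 0/1 correctness."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")
    return sum((p - float(y)) ** 2 for p, y in zip(confidences, correct)) / len(confidences)


# Made-up illustration: the same four answers, two of them wrong.
outcomes = [1, 0, 1, 0]
overconfident = [0.95, 0.90, 0.99, 0.90]  # high confidence even when wrong
calibrated = [0.90, 0.20, 0.90, 0.10]     # confidence tracks correctness

print(brier_score(overconfident, outcomes))
print(brier_score(calibrated, outcomes))
```

Lower is better: confident wrong answers are penalized quadratically, which is why a term of this kind discourages overconfidence.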

Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization

Researchers propose Psy-CoT, a psychology-grounded chain-of-thought framework that decomposes pre-response reasoning into Interaction Perception, Psychological Empathy, and Logical Construction to improve character fidelity. To address gradient misalignment in reinforcement learning, they introduce Role-Aware Policy Optimization (RAPO), which uses profile-token mutual information to weight gradients asymmetrically.