All articles — korshunov.ai

Memory-Managed Long-Context Attention: A Preliminary Study of Editable Request-Local Memory

This study investigates memory-managed long-context attention by separating a fast recurrent or sparse backbone from explicit editable request-local memory slots and query-time sparse fallback. The research aims to address the limitations of existing linear, recurrent, and sparse attention methods in managing when facts should be written, overwritten, protected, or discarded.

arXiv cs.CL · 5h ago

PASTA: A Paraphrasing And Self-Training Approach for Knowledge Updating in LLMs

This paper introduces PASTA, a framework designed to integrate detailed factual information from news articles into Large Language Models (LLMs) to address the challenge of knowledge updating. The approach combines data augmentation, question-answering generation, and a novel self-learning Direct Preference Optimization (DPO) process to enable knowledge overwriting and hallucination suppression.

arXiv cs.CL · 5h ago

MedEvoEval: Evaluating Continual Evolution of Doctor Agents through Simulated Clinical Episodes

The authors introduce MedEvoEval, an executable longitudinal evaluation framework designed to assess the continual evolution of doctor agents through simulated outpatient clinical episodes. This system moves beyond static benchmarks by tracking how agents acquire evidence, utilize resources, and refine their decision-making across multiple interactions.

arXiv cs.CL · 5h ago

Latent Bridges for Multi-Table Question Answering

The authors introduce GRAB, a constructor-encoder-bridge pipeline designed for table question answering that lifts relational data into a heterogeneous graph and encodes it via message passing. The method transfers signals to a frozen large language model through a small set of query-conditioned latent tokens, providing a compact structural representation while preserving the LLM's general reasoning capabilities.

arXiv cs.CL · 6h ago

FinInvest-GTCN: Explainable Graph-Temporal-Causal Modeling for Risk-Aware Investment Decision Optimization

Researchers introduce FinInvest-GTCN, a Graph-Temporal-Causal Network designed to optimize venture capital investment decisions by addressing challenges like heterogeneous data and non-stationary time series. The model redefines the task from content recommendation to quantitative risk-return assessment, utilizing a relational graph encoder, multi-scale temporal fusion, and a causal decision head to generate interpretable predictions.

arXiv cs.CL · 6h ago

EVLA: An Electro-Aware Multimodal Assistant for Physically-Grounded Driving Reasoning and Control

The authors introduce the Electro-Visual-Language Assistant (EVLA), a framework that integrates multi-modal scene understanding with real-time perception of an electrified powertrain's electro-mechanical state to improve driving decisions. This approach addresses the limitation of existing vision-language models that treat vehicle dynamics as a black box by incorporating physical constraints and optimization objectives.

arXiv cs.CL · 6h ago

A3M: Adaptive, Adversarial and Multi-Objective Learning for Strategic Bidding in Repeated Auctions

The A3M framework addresses the challenges of learning to bid in repeated multi-unit auctions by integrating adaptive deep reinforcement learning, adversarial reasoning, and multi-objective reward design. It utilizes an actor-critic backbone and opponent modeling to optimize strategy against non-stationary adversaries while balancing utility, revenue, and fairness.

arXiv cs.CL · 6h ago

Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System

This paper proposes a filtering defense against dirty-label poisoning attacks on speech commands classification systems by clustering unsupervised representations to identify and remove poisoned training data.

arXiv cs.CL · 6h ago

Beyond the Mean: Three-Axis Fidelity for Aligning LLM-Based Survey Simulators from Small Pilot Data

This study investigates whether large language models can recover the statistical characteristics of a broader population using only a small pilot sample of human responses. The authors decompose this recovery into three axes: structural fidelity, marginal fidelity, and individual fidelity.

arXiv cs.CL · 6h ago

Can LLMs Hire Fairly? Racial Bias in Resume Screening

An audit of fourteen mainstream large language models reveals a significant shift in racial bias within resume screening algorithms over recent years. While 2023-vintage models reproduce pro-White callback gaps, all models released in 2024 or later show either null gaps or significant pro-Black reversals.

arXiv cs.CL · 6h ago

AgriTune-R: A Reproducible Framework for Fine-Tuning LLMs in Agriculture

The paper introduces AgriTune-R, a reproducible and auditable framework designed to adapt general-purpose large language models for specific agricultural applications. This approach addresses the domain-specific, safety-critical nature of agriculture by integrating data governance, expert evaluation, and evidence constraints to prevent unreliable advice.

arXiv cs.CL · 6h ago

BERTomelo: Your Portuguese Encoder Best Friend

This article introduces BERTomelo, a next-generation monolingual encoder specifically optimized for the Portuguese language using the ModernBERT architecture.

arXiv cs.CL · 6h ago

Conversational Domain Adaptation of IndicTrans2 via Experience Replay and Model Soups

The authors adapt the open-source IndicTrans2-1B translation system to handle conversational register across 21 Indic languages using only public datasets. By combining experience replay with model souping, they achieve significant improvements in automatic metrics without degrading performance on general domain tasks.

arXiv cs.CL · 6h ago
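
Model souping, as mentioned in the IndicTrans2 summary above, generally means averaging the weights of several fine-tuned checkpoints into a single model. The summary does not say which souping variant or which checkpoints the authors use, so the following is only a minimal sketch of a uniform soup. The `uniform_soup` function and the plain-dictionary checkpoint format are hypothetical stand-ins for illustration, not the paper's code.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def uniform_soup(checkpoints: List[Weights]) -> Weights:
    """Average matching parameters element-wise across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")

    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints must share parameter names")

    soup: Weights = {}
    for name in names:
        size = len(checkpoints[0][name])
        if any(len(ckpt[name]) != size for ckpt in checkpoints):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # zip(*...) groups the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        soup[name] = [sum(values) / len(checkpoints) for values in columns]
    return soup


# Toy example with two made-up checkpoints of one parameter each.
merged = uniform_soup([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}])
print(merged)
```

In practice the checkpoints would be tensors from a deep learning framework, and a soup can also be built greedily or with non-uniform weights; the uniform average is just the simplest case.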

Clinical Evidence Strength Is Recoverable From LLM Representations, Not Stated Grades

A study of 22 open-weight large language models reveals that while the strength of clinical evidence can be recovered from model activations and text, the grades explicitly stated by the models are no better than chance. Researchers analyzed 45,134 clinical claims harmonized into four-level evidence grades to test whether models register and express evidence strength distinct from factual truth.

arXiv cs.CL · 7h ago

How to Leverage Synthetic Speech for LLM-Based ASR Systems?

Researchers investigate the distributional gap between synthetic and real speech in LLM-based automatic speech recognition (ASR) systems by probing a SLAM-ASR architecture. They identify that discriminative signals separating the two data types are concentrated in the early-to-middle layers of the model backbone.

arXiv cs.CL · 7h ago

Masked Diffusion Decoding as x-Prediction Flow

This paper introduces a continuous decoding framework for masked diffusion language models (MDLMs) that reinterprets mask prediction as clean-state prediction to induce a continuous flow in input embedding space. By allowing tokens to accumulate partial progress and remain revisable, the method addresses the premature commitments inherent in standard binary unmasking regimes.

arXiv cs.CL · 7h ago

ThinkProbe: Structural Profiling of LLM Reasoning via Non-Generative Thought Graphs

ThinkProbe is a framework for the structural analysis of large language model reasoning traces, converting them into directed Thought Graphs with eight node types and six edge types. It derives a 19-metric five-dimensional cognitive profile through a fully non-generative pipeline combining rule-based segmentation and discriminative semantic linking.

arXiv cs.CL · 7h ago

A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories

This study investigates the extent to which modern text encoders capture psychological theories of affect by evaluating twelve recently released models across three established emotion frameworks. The research compares word-level and sentence-level performance using both regression and classification tasks.

arXiv cs.CL · 7h ago

Low-cost concept-based localized explanations: How far can we get with training-free approaches?

This study evaluates whether mid-scale Multimodal Large Language Models (MLLMs) can perform localized concept naming under strict zero-shot conditions by assigning labels to bounding-box regions. The authors propose a reproducible evaluation protocol for Concept Naming that includes closed-set prompting and an embedding-similarity-based strategy for large label spaces.

arXiv cs.CL · 7h ago

Evolution Fine-Tuning: Learning to Discover Across 371 Optimization Tasks

Researchers introduce Evolution Fine-Tuning (EFT), a mid-training paradigm that teaches Large Language Models to evolve solutions across diverse tasks by converting evolutionary search trajectories into supervision. This approach addresses the limitation of prior methods that discard accumulated experience, enabling models to reuse discovery capabilities rather than solving new problems from scratch.