All articles — korshunov.ai


Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs

Researchers extended the Werewolf game with a Jester role to create a triadic social-deduction environment that requires reasoning across three opposing utility functions, challenging large language models' theory-of-mind capabilities. In evaluations on GPT-4.1, DeepSeek-V3.1, and Llama-3.3-70B, the Jester won 60-70% of games, and GPT-4.1 wolves voted the Jester out on day 1 in 60-70% of cases, a self-defeating action driven by language priors.

arXiv cs.CL · 4h ago

Verifiable Geometry Problem Solving: Solver-Driven Autoformalization and Theorem Proposing

Researchers propose SD-GPS, a solver-driven framework for geometry problem solving that addresses bottlenecks in autoformalization and theorem prediction by treating the symbolic solver as an execution oracle. This approach unifies supervised formal-language adaptation with solvability-guided reinforcement learning to ensure executability during formalization.

arXiv cs.CL · 4h ago

VASAE: Naming SAE Dictionary Directions with Vocabulary-Aligned Anchoring

The authors introduce Vocabulary-Aligned Sparse Autoencoder (VASAE), a method that trains sparse autoencoder features using vocabulary-aligned anchoring to assign each feature an intrinsic token name based on the nearest embedding in the Transformer's vocabulary.

arXiv cs.CL · 4h ago
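
The naming step in the VASAE summary above (giving a feature the name of its nearest vocabulary embedding) can be illustrated with a minimal sketch. It covers only the lookup, not the anchored training itself. The cosine metric, the function name `name_feature`, and the toy three-token vocabulary are assumptions made for illustration; the summary does not say which distance the authors use.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def name_feature(direction, vocab_embeddings):
    """Return the token whose embedding is closest to the feature direction.

    Hypothetical helper: cosine similarity is an assumed choice of metric.
    """
    return max(vocab_embeddings,
               key=lambda token: cosine(direction, vocab_embeddings[token]))

# Toy vocabulary and feature direction, invented for this example.
vocab = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "tree": [0.0, 0.0, 1.0],
}
feature = [0.1, 0.9, 0.2]
name = name_feature(feature, vocab)
```

Here `name` is the token whose embedding points most nearly in the same direction as the feature.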

AI Persuasive Framing in Collective Dilemmas

A study involving 1,283 participants tested whether AI assistants could enhance cooperation in iterated Collective Risk Games through personalized persuasive framing based on Social Value Orientation profiles. The research found that while pro-social nudges significantly increased contributions and group success rates, these effects were short-lived.

arXiv cs.CL · 5h ago

An Empirical Analysis of Factual Errors in Human-Written Text and its Application

This study addresses the neglect of factual error detection in human-written text by distilling a taxonomy of errors from newspaper article corrections, revealing categories like kanji misconversions that are absent in current hallucination benchmarks. The authors evaluate vanilla large language models on synthesized test cases and real corrections to assess their performance on this specific task.

arXiv cs.CL · 5h ago

Multi-Stage Explainable Framework for Speech-Based Cognitive Impairment Detection

Researchers propose a multi-stage explainability framework that translates black-box transformer predictions into clinically grounded narratives for speech-based cognitive impairment detection. The system integrates SHAP-based token attribution, linguistic features, and an LLM reasoning pipeline to map model outputs to specific cognitive-linguistic dimensions.

arXiv cs.CL · 5h ago

ToxiREX: A Dataset on Toxic REasoning in ConteXt

Researchers introduce ToxiREX, a new multilingual dataset designed to capture and explain implicit, context-dependent toxicity within Reddit comment threads. The dataset utilizes a systematic toxic reasoning schema to provide structured annotations for comments related to major global events across six languages.

arXiv cs.CL · 5h ago

Dialogue to Detection: A Multimodal Hybrid NLP Pipeline for Insurance Fraud Detection

This article introduces a synthetic multimodal framework designed to replicate First Notice of Loss (FNOL) conditions for insurance fraud detection, addressing the limitations of existing text-only approaches. The system generates agent-customer dialogue transcripts and two-speaker audio to integrate linguistic, behavioral, and speaker-based indicators.

arXiv cs.CL · 5h ago

The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization

This article introduces a signal-coverage matrix to stratify type and semantic errors in LLM autoformalization, moving beyond scalar type-correctness metrics. The framework categorizes outputs into true success, type-only, semantic-only, or both fail cells by crossing Lean elaborator results with semantic equivalence judgments.

arXiv cs.CL · 5h ago
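
The four-cell stratification in the signal-coverage summary above can be sketched as a crossing of two boolean signals. This is an illustrative reading, not the authors' implementation: it assumes "type-only" means the output passes the Lean elaborator but fails the semantic judgment, and "semantic-only" the reverse. The function names and the sample results are hypothetical.

```python
from collections import Counter

def cell(type_checks: bool, semantically_equivalent: bool) -> str:
    """Map one output's two signals to a matrix cell (assumed cell meanings)."""
    if type_checks and semantically_equivalent:
        return "true success"
    if type_checks:
        return "type-only"       # elaborates, but the meaning is judged wrong
    if semantically_equivalent:
        return "semantic-only"   # meaning judged right, but fails to elaborate
    return "both fail"

def signal_coverage_matrix(results):
    """Count outputs per cell from (type_checks, semantically_equivalent) pairs."""
    return Counter(cell(t, s) for t, s in results)

# Invented results for five autoformalized statements.
outputs = [(True, True), (True, False), (False, True), (False, False), (True, True)]
matrix = signal_coverage_matrix(outputs)
```

A scalar type-correctness rate would report 3 of 5 for these outputs, while the matrix shows that one of those three is a type-only result.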

Tree-of-Thoughts Hybrid Approach for Legal Case Judgement Summarization

This study proposes a tree-of-thoughts-inspired extractive-abstractive summarization approach for legal case judgements, addressing the limited exploration of hybrid techniques in prior work. Experiments with DeepSeek and LLaMA models show that the proposed method yields better summaries than traditional extractive or abstractive prompts.

arXiv cs.CL · 5h ago

DG^VoiC: Speaker Clustering for Fraud Investigation under Real Call-Centre Conditions

This paper introduces DG^VoiC, a voice clustering framework designed to identify repeated speakers in anonymized real call-center audio to assist in fraud investigation. The method combines anonymization aligned to sensitive information, speech-focused preprocessing, sliding-window speaker embedding extraction, and clustering based on cosine similarity.

arXiv cs.CL · 5h ago
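
The final stage named in the DG^VoiC summary above, grouping speaker embeddings by cosine similarity, can be sketched with a simple greedy scheme. The summary does not specify the clustering algorithm, so the greedy assignment, the `threshold` value of 0.8, and the two-dimensional toy embeddings are all assumptions for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(embeddings, threshold=0.8):
    """Greedily assign each embedding to the most similar existing cluster.

    A new cluster is opened when no cluster reaches `threshold` (assumed value).
    Each cluster is represented by the running sum of its members, which has
    the same direction as their mean.
    """
    sums, labels = [], []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, total in enumerate(sums):
            sim = cosine(emb, total)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            sums.append(list(emb))
            labels.append(len(sums) - 1)
        else:
            sums[best] = [x + y for x, y in zip(sums[best], emb)]
            labels.append(best)
    return labels

# Invented embeddings for four calls; two speakers are intended.
calls = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
labels = cluster(calls)
```

Calls that share a label are candidates for the same repeated speaker. In practice the embeddings would come from the sliding-window extraction stage rather than being written by hand.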

LLMs Judge Worse Than They Generate in In-Context QA

A study challenges the assumption that large language models evaluate their own outputs better than they generate them, finding that generation accuracy exceeds self-evaluation on three of four tested benchmarks. The research utilizes a controlled in-context QA setting to isolate evaluation performance from parametric knowledge confounds.

arXiv cs.CL · 5h ago

MultiHashFormer: Hash-based Generative Language Models

The paper introduces MultiHashFormer, a framework enabling hash-based autoregression in causal language models by representing tokens as unique signatures of discrete hash IDs. This approach allows the model to compress token information into latent vectors for Transformer processing while mapping them back to text, effectively addressing the many-to-one collision issues that previously prevented hashing in generative contexts.

arXiv cs.CL · 5h ago

Single and Multi Truth Data Fusion using Large Language Models

This paper investigates the use of Large Language Models (LLMs) for data fusion tasks involving tabular data, covering both single-truth and multi-truth scenarios. The study evaluates various prompting strategies across three benchmark datasets to determine their effectiveness in resolving conflicting values from multiple sources.

arXiv cs.CL · 6h ago

Scaling limit of the Random Language Model

This article develops a quantitative theory for the Random Language Model (RLM) in a scaling limit where the number of hidden symbols approaches infinity while the grammar temperature approaches zero at a fixed ratio. The study establishes that the model admits a controlled description based on a large-deviation principle over rule-usage patterns, mapping the problem to Random Energy Models with nontrivial combinatorics.

arXiv cs.CL · 6h ago

Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability

This article introduces mechanism-driven monitors designed to detect large language model training instability before it causes significant damage. By deriving internal signals from the functional roles of critical modules, these monitors identify failures thousands of steps earlier than traditional loss-based methods.

arXiv cs.CL · 6h ago

From Tokens to States: LLMs as a Special Case of World Models

The article challenges the dichotomy between large language models and world models, arguing that LLMs are a degenerate special case of world models rather than a replacement for them. It posits a continuous spectrum from next-token prediction to latent-space architectures, with current research already occupying intermediate positions.

arXiv cs.CL · 6h ago

Epi2Diff: Using LLM Reasoning Traces to Predict Human Item Difficulty

Researchers introduce Epi2Diff, a framework that maps Large Reasoning Model (LRM) traces into cognitively grounded episode sequences to predict human item difficulty in educational assessment. By modeling difficulty through reasoning scale, effort allocation, and state transitions, the method provides an interpretable alternative to costly human calibration.

arXiv cs.CL · 6h ago

HPRO: Hierarchical Progressive Reward Optimization for Emotional TTS

The authors propose HPRO, a hierarchical progressive reward optimization framework designed to enhance emotional expressiveness in LLM-based Text-to-Speech models while preserving linguistic intelligibility. This approach addresses structural mismatches in existing preference-driven methods by isolating content and emotion and bridging the gap between sparse rewards and dense generation.

arXiv cs.CL · 6h ago

Vision-Default, Prior-Override: Causal Mechanisms of Perception-Knowledge Conflict in Vision-Language Models

This study investigates how vision-language models resolve conflicts between visual evidence and memorized world knowledge by combining activation patching with mechanistic analysis across three model families. The research identifies a sparse causal circuit where visual grounding is the default, while overriding it with prior knowledge requires specific attention heads.