All articles — korshunov.ai

The Geometry of Updates: Fisher Alignment at Vocabulary Scale

This article addresses the challenge of training-free source selection for large language models with shared vocabularies in scientific domains like SMILES and genomics, where classical metrics are either uninformative or computationally prohibitive. The authors demonstrate that representation similarity metrics are non-identifiable for transfer because models can share identical representations yet have orthogonal head updates.
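
The non-identifiability claim can be illustrated with a toy calculation. The sketch below assumes a linear softmax head trained with cross-entropy, where the gradient with respect to the head weights is the outer product of the error vector (p - y) and the representation h. This setup, the two hypothetical tasks, and all names are illustrative assumptions and are not taken from the paper.

```python
# Toy illustration under assumed conditions (not the authors' construction):
# two tasks that see the SAME representation h can still produce head
# gradients that are orthogonal to each other.

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def frobenius_inner(a, b):
    """Sum of elementwise products of two equally shaped matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def head_gradient(probs, target_index, h):
    """Cross-entropy gradient w.r.t. head weights W (logits = W h): (p - y) h^T."""
    err = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    return outer(err, h)

h = [1.0, 2.0]  # shared representation, identical for both hypothetical tasks

# Hypothetical source task: probability mass and target on tokens 0 and 1.
grad_source = head_gradient([0.5, 0.5, 0.0, 0.0], target_index=0, h=h)
# Hypothetical target task: probability mass and target on tokens 2 and 3.
grad_target = head_gradient([0.0, 0.0, 0.5, 0.5], target_index=2, h=h)

alignment = frobenius_inner(grad_source, grad_target)
print(alignment)  # zero: the error vectors touch disjoint vocabulary entries
```

Because the inner product of the two gradients factors as (error_source · error_target)(h · h), any metric that looks only at h cannot separate this case from one where the updates are fully aligned.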

arxiv arXiv cs.CL · 5h ago

How Surprising Is Historical Italian to Language Models? Tokenization Tax, Comprehension Tax, and a Simple Mitigation

This paper proposes a diagnostic framework decomposing historical language difficulty into tokenization cost, predictive uncertainty, semantic robustness, and context sensitivity. The authors evaluate this framework on 17th-century Italian, 19th-century Italian, and 18th-century Russian texts to understand how LLMs process historical languages.

arxiv arXiv cs.CL · 5h ago

Multilingual Reasoning Cascades Need More Context

Translation cascades for reasoning translate a query into English, reason in English, and translate the answer back, but the process is structurally lossy because each stage discards information. The authors propose a context-aware translation cascade that preserves the original question, translated query, and reasoning trace to mitigate these losses.

arxiv arXiv cs.CL · 5h ago

Beyond Surface Forms: A Comprehensive, Mechanism-Oriented Taxonomy of Indirect Linguistic Encoding for LLM-Based Coded Language Detection

Researchers propose a mechanism-oriented taxonomy of indirect linguistic expressions (ILE) to categorize the underlying operations used to encode and recover meaning in coded language. This approach abstracts away from communicative goals to focus on the specific encoding mechanisms found in algospeak, euphemisms, and adversarial obfuscation.

arxiv arXiv cs.CL · 5h ago

LLM-Based Examination of Eligibility Criteria from Securities Prospectuses at the German Central Bank

This paper presents the first case study applying Large Language Models to the German Central Bank's process of verifying securities eligibility for collateral, shifting from traditional Named Entity Recognition to a generative Information Extraction pipeline. The approach decomposes the task into extraction, normalization, and interpretation to handle noisy text and bilingual content more effectively.

arxiv arXiv cs.CL · 5h ago

Empowering GUI Agents via Autonomous Experience Exploration and Hindsight Experience Utilization

Researchers introduce the Planning Experience Exploration and Utilization (PEEU) method to enhance task planning in multimodal web agents using small open-source Multimodal Large Language Models (MLLMs). This approach autonomously explores environments to discover experiences and synthesizes high-level training data through hindsight experience utilization.

arxiv arXiv cs.CL · 5h ago

Assessing Post-Reform Changes in Risk Disclosure Quality with a Multidimensional Text Analysis Approach

This study proposes a longitudinal text analysis framework combining Japanese-language NLP metric extraction with paired testing and shift function analysis to evaluate qualitative changes in corporate risk disclosures. Applied to Japan's 2019 disclosure reforms, the approach analyzes 19,770 firm-year observations over ten years to capture multidimensional dynamics often masked by single-indicator methods.

arxiv arXiv cs.CL · 6h ago

Mapping Political-Elite Networks in Europe with a Multilingual Joint Entity-Relation Extraction Pipeline

Researchers present a modular, fully open-weight pipeline for multilingual joint entity-relation extraction that builds signed, temporal knowledge graphs from massive unstructured news corpora. The system combines span-based named-entity recognition with a linking cascade to Wikidata and an ontology-constrained mixture-of-experts model to extract directed relationships.

arxiv arXiv cs.CL · 6h ago

DanceOPD: On-Policy Generative Field Distillation

The authors introduce DanceOPD, an on-policy generative field distillation framework designed to unify text-to-image generation with local and global editing capabilities in flow-matching models. This approach routes samples to specific capability fields and trains using a velocity MSE objective to compose expert skills without mutual interference.

media r/LocalLLaMA · 6h ago

Request for Good YouTube Channels for Local LLM News

A Reddit user is seeking recommendations for YouTube channels that provide news and updates on local large language model development.

media r/LocalLLaMA · 6h ago

When you don't have a data center GPU

The post points to the LiquidAI LFM2.5-230M model as an option for users without access to data center GPUs.

media r/LocalLLaMA · 6h ago

Ornith-1.0: Open-source LLMs for agentic coding

Ornith-1.0 is a new family of open-source large language models specialized for agentic coding tasks. The model family spans multiple parameter sizes, including 9B Dense, 35B MoE, and 397B MoE configurations.

arxiv arXiv cs.CL · 6h ago

Nemotron-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context

NVIDIA introduces Nemotron-TwoTower, a diffusion language model that decouples context representation and iterative denoising into two separate networks to overcome capacity limitations in existing approaches. Built on the open-weight Nemotron-3-Nano-30B-A3B model and trained on 2.1T tokens, it retains 98.7% of the autoregressive baseline's quality while achieving 2.42x higher wall-clock generation throughput.

arxiv arXiv cs.CL · 6h ago

Humans Disengage, Reasoning Models Persist: Separating Difficulty Registration from Deliberation Allocation

A study reveals that while large reasoning models (LRMs) and humans both spend more time on harder problems, they diverge significantly in how they allocate deliberation within specific items. On items they answer incorrectly, LRMs generate more tokens than on items they answer correctly, whereas humans do the opposite and spend less time on the trials they get wrong.

arxiv arXiv cs.CL · 6h ago

MemStrata: Eliminating Stale-Fact Errors in RAG Agents via Temporal Validity

The article introduces MemStrata, a retrieval memory system designed to eliminate stale-fact errors in AI agents by maintaining temporal validity within accumulated knowledge. Unlike standard Retrieval-Augmented Generation (RAG), which struggles to distinguish between duplicated and contradicted facts due to embedding similarity, MemStrata uses a deterministic supersession rule to retire outdated information.
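
The summary does not spell out the supersession rule, so the following is only a minimal sketch of one way such a rule could work. It assumes facts are keyed by a (subject, attribute) pair, that writes arrive in timestamp order, that a repeated value counts as a duplicate, and that a different value contradicts and retires the current fact. The `Fact` and `TemporalMemory` names are hypothetical and not from MemStrata.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    valid_from: int
    valid_to: Optional[int] = None  # None means the fact is still current

class TemporalMemory:
    """Hypothetical store that retires contradicted facts instead of keeping both."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def lookup(self, subject: str, attribute: str) -> Optional[Fact]:
        # Return the single current (non-retired) fact for this key, if any.
        for fact in reversed(self.facts):
            if (fact.subject, fact.attribute) == (subject, attribute) and fact.valid_to is None:
                return fact
        return None

    def write(self, subject: str, attribute: str, value: str, timestamp: int) -> Fact:
        current = self.lookup(subject, attribute)
        if current is not None:
            if current.value == value:
                return current            # duplicate: keep the existing fact
            current.valid_to = timestamp  # contradiction: retire the old fact
        fact = Fact(subject, attribute, value, valid_from=timestamp)
        self.facts.append(fact)
        return fact

memory = TemporalMemory()
memory.write("alice", "employer", "Acme", timestamp=1)
memory.write("alice", "employer", "Acme", timestamp=2)    # duplicate, ignored
memory.write("alice", "employer", "Globex", timestamp=3)  # supersedes "Acme"
print(memory.lookup("alice", "employer").value)
```

The decision depends only on exact key and value comparison, not on embedding similarity, which is what makes a rule of this kind deterministic.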

arxiv arXiv cs.CL · 6h ago

Erase-then-Delta Attention: Decoupling Erase and Write Addresses in Delta-Rule Linear Attention

The authors propose Erase-then-Delta Attention (EDA), a memory update rule for recurrent models that decouples the address used to erase stale information from the address used to write new content. This approach addresses the limitation of delta-rule linear attention, which cannot actively remove outdated data stored at different locations before writing.
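
As a rough illustration of the idea, the sketch below contrasts a standard delta-rule update, which erases and writes at the same key, with an update that takes a separate erase address. The decoupled form used here is one plausible reading of the summary and is an assumption, not the paper's exact rule; the function names, the unit-norm keys, and the step size `beta` are likewise illustrative.

```python
# Memory S is a (value_dim x key_dim) matrix stored as a list of rows.
# Keys are assumed to be unit-norm; beta is an assumed step size.

def read(S, key):
    """Recall the value stored at `key`: S @ key."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, key)) for row in S]

def update(S, erase_key, write_key, value, beta=1.0):
    """Erase what is stored at erase_key, then write value at write_key.

    With erase_key == write_key this reduces to the standard delta rule:
    S <- S - beta * (S k - v) k^T.
    """
    recalled = read(S, erase_key)
    return [
        [
            S[i][j] - beta * recalled[i] * erase_key[j] + beta * value[i] * write_key[j]
            for j in range(len(write_key))
        ]
        for i in range(len(value))
    ]

k_old, k_new = [1.0, 0.0], [0.0, 1.0]  # orthogonal addresses
v_old, v_new = [3.0, 0.0], [0.0, 5.0]

empty = [[0.0, 0.0], [0.0, 0.0]]
S = update(empty, k_old, k_old, v_old)  # store the soon-to-be-stale value

# Standard delta rule: erase and write share k_new, so v_old at k_old survives.
S_delta = update(S, erase_key=k_new, write_key=k_new, value=v_new)
# Decoupled update: erase at k_old, write at k_new.
S_decoupled = update(S, erase_key=k_old, write_key=k_new, value=v_new)

print(read(S_delta, k_old), read(S_decoupled, k_old), read(S_decoupled, k_new))
```

In this toy case the shared-address update leaves the stale value readable at `k_old`, while the decoupled update clears it and still stores the new value at `k_new`.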

arxiv arXiv cs.CL · 7h ago

The Inattentional Gap: Task-Conditioned Models Omit Safety Signals

A study reveals that conditioning language and vision models on narrow tasks suppresses their ability to report co-present, safety-critical signals they can otherwise detect. This phenomenon, termed the "Inattentional Gap," demonstrates a dissociation between measured benchmark safety and real-world safety.

arxiv arXiv cs.CL · 7h ago

DiARC: Distinguishing Positive and Negative Samples Helps Improving ARC-like Reasoning Ability of Large Language Models

The paper introduces DiARC, a method that improves the abstract reasoning capabilities of large language models by incorporating negative sample supervision alongside positive examples. This approach addresses the limitations of current methods that rely heavily on data augmentation or expensive closed-source models.

arxiv arXiv cs.CL · 7h ago

Compiler-Driven Approximation Tuning for Hyperdimensional Computing

The authors introduce ApproxHDC, a framework that automates the identification and application of domain-specific approximations in Hyperdimensional Computing (HDC) workloads. This system extends the HPVM-HDC compiler infrastructure to enable retargetable compilation across diverse hardware backends, including CPUs, GPUs, and simulated ReRAM and PCM accelerators.

arxiv arXiv cs.CL · 7h ago

Adversarial Diffusion Across Modalities: A Fusion Survey of Attacks, Defenses, and Evaluation

This survey integrates four disconnected tracks of adversarial evaluation—diffusion-based attacks on text and LLMs, image classifiers, vision-language models, and input purification defenses—into a single conceptual framework. It focuses on the LLM-side slice to unify vocabulary, threat models, and benchmarks around denoising diffusion as a shared generative mechanism.