Reasoning models — korshunov.ai

Cross-Lingual Exploration for Parametric Knowledge

Cross-lingual prompting strategies improve factual knowledge retrieval across 17 diverse languages. The approach outperforms native-language scaling in compute efficiency and enhances cross-lingual consistency beyond accuracy gains.

arXiv cs.CL · 1d ago

Qwen-AgentWorld: Language World Models for General Agents

Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B are the first language world models that simulate agentic environments across seven domains using long chain-of-thought reasoning. Trained via a three-stage pipeline—CPT, SFT, and RL—these models outperform existing frontier models on AgentWorldBench, a benchmark derived from real-world interactions of five models on nine established tasks.

arXiv cs.CL · 1d ago

Cross-Lingual Proverb Studies Reveal Cultural Meaning Preservation in LLMs

A study evaluates how large language models preserve cultural meaning when generating narratives from equivalent proverbs across 15 languages. Results show semantic consistency in moral lessons, with systematic shifts in narrative agency and structure, and strong convergence across model families. The research highlights that current evaluations may overestimate cultural preservation by focusing only on semantic similarity.

arXiv cs.CL · 1d ago

Privacy-Preserving RAG via Multi-Agent Semantic Rewriting

A multi-agent framework sanitizes retrieved content by removing sensitive identifiers through semantic rewriting, reducing privacy leakage in targeted attacks. It maintains strong contextual fidelity with a BLEU-1 score of 0.122, outperforming SAGE's 0.117, and operates as an asynchronous preprocessing step with no added latency to online inference.

arXiv cs.LG · 1d ago

Memory-Efficient Graph Filtering for Scalable Collaborative Filtering

Mem-GF introduces a memory-efficient graph filtering method that approximates polynomial graph filters using Krylov subspaces, eliminating the need to store the full item similarity graph. It achieves up to 5.74× lower memory usage and 4.38× faster runtime while maintaining superior recommendation accuracy compared to state-of-the-art methods, scaling effectively to datasets with tens of millions of interactions.

arXiv cs.LG · 1d ago
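The memory saving described in the Mem-GF summary rests on a general idea that can be sketched briefly: a polynomial in the item similarity matrix can be applied to a vector using only matrix-vector products, so the similarity matrix itself never has to be stored. The sketch below is not the Mem-GF algorithm. It assumes, for illustration only, that the similarity matrix is `S = R^T R` for a user-item interaction matrix `R`, and the names `apply_polynomial_filter` and `coeffs` are hypothetical. The vectors it builds (`x`, `Sx`, `S^2 x`, ...) are the ones that span a Krylov subspace.

```python
def matvec(R, x):
    """Return R @ x for a matrix stored as a list of rows."""
    return [sum(r_ij * x_j for r_ij, x_j in zip(row, x)) for row in R]


def rmatvec(R, y):
    """Return R^T @ y without materializing the transpose."""
    out = [0.0] * len(R[0])
    for row, y_u in zip(R, y):
        for j, r_uj in enumerate(row):
            out[j] += r_uj * y_u
    return out


def apply_polynomial_filter(R, x, coeffs):
    """Compute sum_k coeffs[k] * (R^T R)^k x using only products with R.

    The item-item matrix R^T R is never formed (assumed similarity choice).
    """
    result = [0.0] * len(x)
    v = list(x)  # v holds (R^T R)^k x for the current k
    for k, c in enumerate(coeffs):
        if k > 0:
            v = rmatvec(R, matvec(R, v))
        result = [r + c * v_j for r, v_j in zip(result, v)]
    return result


# Toy data: 2 users, 3 items (illustrative values only).
R = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
x = [1.0, 0.0, 0.0]
filtered = apply_polynomial_filter(R, x, coeffs=[0.5, 0.3, 0.2])
print(filtered)
```

With sparse interactions, each product with `R` costs time proportional to the number of interactions, while an explicit item-item matrix would need memory quadratic in the number of items.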

Distilling Transformers into Recurrent Transformers for Efficient Memory

A new distillation method transfers the observation compression strategy of full-history transformers to recurrent models. By training a teacher model to compress observation histories into fixed-size bottlenecks, the approach aligns the student's memory with the teacher's compression. This enables recurrent transformers to achieve near-full-history performance with linear-time complexity, making them viable for long-horizon robotics applications.

arXiv cs.LG · 1d ago

LIG: Layer-wise Integrated Gradients for Transformer Flow Analysis

LIG extends Integrated Gradients to set-to-set maps in Transformers, enabling token-level attribution within layers. It analyzes module-wise and layer-wide attribution consistency and tracks information flow via separate attention and MLP contributions, using target token embedding and zero or zero-attention outputs as baselines. LIG operates at module boundaries without retraining or custom interpreters, offering a diagnostic XAI tool for Transformer internals.

arXiv cs.LG · 1d ago
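For readers unfamiliar with the base technique that LIG extends, here is a minimal sketch of standard Integrated Gradients on a toy scalar function. It does not implement the layer-wise, set-to-set variant from the summary. The function `f`, its hand-written gradient `grad_f`, and the all-zero baseline are assumptions chosen so the result can be checked by hand.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    Attribution_i = (x_i - baseline_i) * average over the straight path from
    baseline to x of the partial derivative with respect to input i.
    """
    avg_grad = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint of the s-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    return [(xi - b) * a for xi, b, a in zip(x, baseline, avg_grad)]


def f(x):
    """Toy model (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * v for v in x]


x = [1.0, -2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
print(attributions, sum(attributions), f(x) - f(baseline))
```

The attributions sum to `f(x) - f(baseline)` up to floating-point error, which is the completeness property of Integrated Gradients and a useful sanity check on any implementation.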

Cost Geometry of Belief in Noisy Inference

A finite-machine inference model uses cost geometry to quantify belief transitions, combining optimal transport with Fisher information. The framework reveals a wall, honesty, and rigidity in belief spaces, with the Gaussian belief achieving maximal hyperbolic curvature. Thermodynamics sets the cost unit, and the geometric floor of precision diverges at certainty, with the value -1/4 representing a key scale.

arXiv cs.AI · 1d ago

Profile-Based Reference in LLM Grounding

The paper argues that reference in large language models is not a fixed link but a profile-based, context-sensitive, and numerically structured phenomenon. It proposes that LLMs ground reference through linguistic traces parameterized via optimization, with referential profiles distributed and activated through context-sensitive computation, supported by mechanistic interpretability findings.

arXiv cs.AI · 1d ago

Linguistic Distance Affects Consensus in Neural Cellular Automata

A study on neural cellular automata shows that linguistic distance slows consensus and induces mild group divergence without full fragmentation. A collective trained under diverse communication protocols remains robust to mismatch, unlike one trained uniformly, and these results are consistent across ring and 2D grid structures, with parallels to human group dynamics.

arXiv cs.AI · 1d ago

Coherence Illusions in Dutch LLMs Revealed

Dutch language models exhibit coherence illusions similar to human readers. Surprisal and attention entropy metrics show that models are misled by context matches, with energy from associative memory identifying discourse coherence mechanisms.

arXiv cs.AI · 1d ago

ARCO: Adaptive Rubric with Co-Evolution for Multi-Step LLM Agents

ARCO introduces a rubric framework that enables step-level credit assignment for multi-step LLM agents. It jointly updates a shared model with generation and scoring heads, allowing the rubric content and scoring function to co-evolve via on-policy data, improving performance and interpretability across benchmarks.

arXiv cs.AI · 1d ago

FastGAN and Transformer Models Improve Aphid Detection in Faba Beans

A study uses FastGAN to generate 10,000 synthetic hyperspectral images of faba bean leaves, preserving real spectral and structural features. Transformer-based models, particularly Vision Transformer, achieve the highest accuracy and F1-scores in classifying healthy versus aphid-infested leaves, outperforming classical CNNs and demonstrating improved disease detection with reduced false negatives.

arXiv cs.AI · 1d ago

Topological Neural Dynamics: Neuron-wise Sequence Modeling

Topological Neural Dynamics (TND) introduces a neuron-wise framework for sequence modeling, where each neuron evolves independently through a directed graph structure. In a single-player Pong behavior cloning task, TND achieves a mean of 17.47 consecutive catches per round, exceeding every baseline model by a factor of more than three.

arXiv cs.AI · 1d ago

NASDAQ: Normalized Observation Space Dynamics-Augmented Q-Learning

NASDAQ addresses low-dimensional observation challenges in reinforcement learning by normalizing observation spaces to balance reconstruction losses across dimensions. The framework combines value learning with short-term value and next observation prediction, achieving competitive or superior performance with less training time compared to existing methods.

arXiv cs.AI · 1d ago
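The motivation given in the NASDAQ summary, balancing reconstruction losses across observation dimensions, can be illustrated with plain per-dimension standardization. This is a sketch under assumptions rather than the paper's procedure: the helper names `fit_normalizer` and `normalize` and the toy observations are hypothetical. It shows that after standardization each dimension contributes equally to a squared-error loss, regardless of its original scale.

```python
import statistics


def fit_normalizer(observations):
    """Estimate a per-dimension mean and population standard deviation."""
    columns = list(zip(*observations))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant dimensions to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def normalize(obs, means, stds):
    """Standardize one observation dimension by dimension."""
    return [(o - m) / s for o, m, s in zip(obs, means, stds)]


# Toy observations with very different scales per dimension (illustrative).
observations = [[0.01, 1000.0], [0.02, 3000.0], [0.03, 2000.0]]
means, stds = fit_normalizer(observations)
normalized = [normalize(o, means, stds) for o in observations]

# Squared error of a predictor that always outputs the mean, per dimension.
per_dim_mse = [statistics.fmean(v * v for v in col) for col in zip(*normalized)]
print(per_dim_mse)
```

On the raw values, the second dimension's squared error would dominate the first by many orders of magnitude. After standardization both dimensions have unit mean squared error against the mean predictor.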

Influence-Based Explanations for Dysarthria Severity Assessment

A new framework provides instance-level explanations for dysarthria severity assessment by identifying supportive and competing training samples. Using gradient-based influence scores, it links model decisions to perceptible reference cases, enabling auditable and interpretable predictions through controlled deletion experiments.

arXiv cs.AI · 1d ago

TASER: Task-Differentiated Skill Expansion for Heterogeneous Continual Learning

TASER introduces a framework that dynamically expands and routes atomic skills for continual learning across highly heterogeneous tasks. It reduces catastrophic forgetting and improves plasticity by ensuring semantic distinctness and efficient capacity allocation through skill detection and routing mechanisms. Evaluated on HeteroCLBench, a benchmark of 19 diverse tasks across 9 cognitive dimensions, TASER outperforms existing baselines.

arXiv cs.AI · 1d ago

Social World Model for Lifelong Social Intelligence

The Social World Model decomposes social interaction into five dimensions to enable closed-loop learning. It allows open-source models to sustainably improve and retain social capabilities, outperforming baselines and matching closed-source Gemini 3 Flash in key metrics without forgetting across difficulty levels.

arXiv cs.AI · 1d ago

Ramanujan Graph Rewiring Alleviates GNN Over-Squashing

Ramanujan Propagation uses Ramanujan graphs to reduce over-squashing in Graph Neural Networks by ensuring non-negative resistance curvature. The method preserves local connectivity while enabling efficient long-range information flow, outperforming nine state-of-the-art rewiring techniques.

arXiv cs.AI · 1d ago

Transformer Models Highly Sensitive to Noisy Data in Trajectory Prediction

A study finds that Transformer-based trajectory prediction models degrade significantly with noisy object state data. Accuracy drops by 1.3x under mild noise and up to 3.9x under realistic high noise conditions, highlighting their sensitivity and the need for noisier, real-world training data and mitigation strategies.