arXiv cs.CL · 4h ago

Revealing the Technology Development of Natural Language Processing: A Scientific Entity-Centric Perspective

This study analyzes how technologies in Natural Language Processing (NLP) have developed, taking an entity-centric perspective. It extracts four types of scientific entities (methods, datasets, metrics, and tools) and measures their impact via co-occurrence networks. The study finds that pre-trained language models such as BERT, and the Transformer architecture behind them, have become mainstream, and that the average number of entities per paper is increasing, which indicates a growing knowledge burden for researchers.
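To make the co-occurrence idea concrete, here is a minimal sketch, not the paper's pipeline. It assumes each paper has already been reduced to a set of extracted entities; the toy `papers` data, the function names, and the use of weighted degree as an impact proxy are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: one set of extracted entities per paper.
papers = [
    {"BERT", "Transformer", "SQuAD"},
    {"BERT", "SQuAD", "F1"},
    {"Transformer", "BLEU"},
]


def cooccurrence_counts(papers):
    """Count how many papers mention each unordered pair of entities."""
    counts = Counter()
    for entities in papers:
        # Sorting gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(entities), 2):
            counts[(a, b)] += 1
    return counts


def weighted_degree(counts):
    """Sum the edge weights touching each entity (an assumed impact proxy)."""
    degree = Counter()
    for (a, b), weight in counts.items():
        degree[a] += weight
        degree[b] += weight
    return degree


counts = cooccurrence_counts(papers)
degree = weighted_degree(counts)

# Average number of entities per paper, the quantity the summary says is rising.
avg_entities = sum(len(p) for p in papers) / len(papers)
```

In this sketch an edge weight is simply the number of papers in which two entities appear together; the study may weight or normalize its networks differently.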

arXiv cs.CL · 4h ago

KbSD: Knowledge Boundary aware Self-Distillation for Behavioral Calibration

The authors propose KbSD, a framework that addresses reward sparsity in agentic search by using dense token-level supervision and quadrant-adaptive optimization to calibrate when a model should trust its parametric memory and when it should rely on retrieved evidence. The approach uses an information-asymmetric self-distillation process in which a hint-augmented teacher generates calibrated reasoning demonstrations for a student model, without requiring a larger external model.

arXiv cs.CL · 4h ago

ARKD: Adaptive Reinforcement Learning-Guided Bidirectional KL Divergence Distillation for Text Generation

The authors propose ARKD, a reinforcement-learning-based distillation framework with adaptive KL weighting that addresses the limitations of single-KL-objective methods for compressing Large Language Models. A policy network dynamically assigns weights to the forward and reverse KL divergences based on the distributional characteristics of the teacher and student, so that the student aligns with both the principal and the long-tail modes.
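The weighted combination of forward and reverse KL can be sketched as follows. This is an illustration under stated assumptions, not ARKD itself: the weight `alpha` is a fixed scalar here, whereas the summary describes a policy network that chooses the weights, and the function names and toy distributions are hypothetical.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions on the same support.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mixed_kl(teacher, student, alpha):
    """Return alpha * forward KL + (1 - alpha) * reverse KL.

    alpha is a hand-set scalar in [0, 1] in this sketch.
    """
    forward = kl(teacher, student)  # KL(teacher || student)
    reverse = kl(student, teacher)  # KL(student || teacher)
    return alpha * forward + (1.0 - alpha) * reverse


# Hypothetical next-token distributions over a three-word vocabulary.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

loss = mixed_kl(teacher, student, alpha=0.5)
```

Forward KL is commonly described as mass-covering and reverse KL as mode-seeking, which is one reading of why a mixture of the two could serve both principal and long-tail modes.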

arXiv cs.CL · 5h ago

Clinical Reasoning Graphs: Structured Evaluation of LLM Diagnostic Reasoning Reveals Competence Without Consistency

This study introduces clinical reasoning graphs for evaluating the diagnostic reasoning patterns of large language models, and finds that the models are competent but lack consistent reasoning schemas. The authors extracted structured graph representations from 750 reasoning traces across five LLMs and tested whether clinically similar cases produced stable reasoning patterns.

arXiv cs.CL · 5h ago

MemDelta: Controlled Baselines and Hidden Confounds in Agent Memory Evaluation

The authors introduce MemDelta, a controlled evaluation protocol for agent memory systems that isolates individual components so that confounding variables do not skew results. Using the LongMemEval-S dataset, with 500 questions across three model families, the study shows that reported gains often conflate changes in the memory method with changes in the language model or the retrieval pipeline.