arXiv cs.CL · 5h ago

HyperDFlash: MHC-Aligned Block Speculative Decoding with Gated Residual Reduction

HyperDFlash is a block-parallel speculative decoding framework that addresses the feature misalignment arising when DFlash is adapted to DeepSeek-V4's multi-hyper-connection (MHC) architecture. The authors propose two key optimizations: conditioning on pre-collapse residual states, and replacing the generic linear compressor with a lightweight gated residual reducer inherited from the model's hyper-connection head.

arXiv cs.CL · 6h ago

Computational Study of Lexical Transmission Across Bengali Devotional Traditions

A computational corpus study analyzes vocabulary relationships across eight layers of Bengali and Sanskrit devotional literature from the 8th to 19th centuries, quantifying the historical claim that Buddhist Vajrayana vocabulary was absorbed into the Shakta Tantra tradition. Using TF-IDF character n-gram vectorization on 75 texts, the research provides the first quantitative corroboration of this lexical transmission chain.
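For readers unfamiliar with the technique, the following is a minimal sketch of TF-IDF weighting over character n-grams followed by cosine similarity, written in plain Python. It is not the study's code: the function names, the trigram size, the smoothed IDF formula, and the toy strings are all assumptions chosen for illustration, and the toy strings are not drawn from the 75-text corpus.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n=3):
    """Build sparse TF-IDF vectors (dicts) over character n-grams.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, picked for
    illustration; the study's exact weighting is not stated in the summary.
    """
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    # Document frequency: in how many documents each n-gram appears.
    df = Counter(g for c in counts for g in c)
    num_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            g: (tf / total) * (math.log((1 + num_docs) / (1 + df[g])) + 1)
            for g, tf in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy placeholder strings (hypothetical, not from the study's corpus).
docs = ["vajra tantra", "vajra mantra", "xyz qrs"]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # texts sharing many trigrams
print(cosine(vecs[0], vecs[2]))  # texts sharing no trigrams
```

Character n-grams are a common choice when spelling and transliteration vary across texts, because partial overlaps between word forms still contribute to the similarity score.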

arXiv cs.CL · 6h ago

Cascaded Multi-Granularity Pruning for On-Device LLM Inference in Industrial IoT

This article introduces a cascaded multi-granularity pruning framework for deploying large language models on Industrial Internet of Things (IIoT) edge devices. It removes layers, attention heads, and feed-forward channels in coarse-to-fine order, and uses lightweight low-rank recovery between stages to re-estimate component importance, addressing the collapse of existing structured pruning methods at high compression ratios.

arXiv cs.CL · 6h ago

Heterogeneous Neural Predictivity from Language Models During Naturalistic Comprehension

This study demonstrates that frozen language models can serve as effective neural predictors for brain activity during natural speech and text comprehension, while distinguishing predictive utility from claims about shared neural organization. The analysis of MEG and ECoG data revealed widespread positive prediction gains over low-level baselines, though participant-level advantages were localized rather than uniform.