arXiv cs.CL · 8d ago · research

VoidPadding: Decoupling [EOS] Termination and Padding in MDLMs


VoidPadding introduces a dedicated padding token, [VOID], for masked diffusion language models (MDLMs), separating semantic termination ([EOS]) from response-length modeling. On mathematical reasoning and code generation, it improves performance by 17.84 points over the original model and reduces the number of function evaluations (NFE) during decoding by 55.7% on average.
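
To make the decoupling concrete, here is a minimal sketch of how a fixed-length target might be laid out with the two tokens. It is an illustration under assumptions, not the paper's implementation: the helper names `pad_response` and `strip_response`, the string tokens, and the choice of a single [EOS] followed by [VOID] filler are all hypothetical.

```python
# Illustrative sketch only: the exact layout used by VoidPadding is not given
# in the summary. Assumption: one [EOS] marks the end of the response and
# [VOID] fills the remaining positions of the fixed-length block.
EOS = "[EOS]"
VOID = "[VOID]"


def pad_response(tokens, block_len):
    """Return tokens followed by one EOS and VOID filler up to block_len."""
    if len(tokens) + 1 > block_len:
        raise ValueError("response plus [EOS] does not fit in block_len")
    n_pad = block_len - len(tokens) - 1
    return list(tokens) + [EOS] + [VOID] * n_pad


def strip_response(padded):
    """Recover the response by cutting at the first EOS."""
    out = []
    for tok in padded:
        if tok == EOS:
            break
        out.append(tok)
    return out


padded = pad_response(["x", "=", "2"], 6)
print(padded)
print(strip_response(padded))
```

In this layout [EOS] appears once and carries only the "response ends here" signal, while the number of [VOID] positions reflects how much of the block is unused.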

