Position Bias Correction Is Insufficient for One-Pass Attention Sorting
This study investigates whether correcting position bias lets single-pass attention sorting match the performance of iterative methods in long-context language models. Experiments on LLaMA-2 and YaRN-Llama-2 models refute the hypothesis: debiasing alone does not close the performance gap between the single-pass and iterative approaches.