This study shows that sinusoidal positional encodings preserve functional equivalence in Transformers, whereas rotary positional encodings reduce symmetry, which enhances expressivity. It also shows that positional encodings critically influence linear mode connectivity: in the empirical results, connectivity varies with the encoding used.
Functional Equivalence in Attention with Positional Encodings
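To make the symmetry claim concrete, here is a minimal NumPy sketch of one reading of it; it is an illustration, not the paper's code. It assumes single-head attention scores of the form `q @ k.T`, a standard rotary encoding with base 10000 applied to queries and keys after projection, and a hypothetical invertible matrix `a` that reparameterizes the query and key weights. The input `x` stands in for token embeddings with any additive (for example sinusoidal) encoding already summed in. All names (`rope`, `scores`, `a`, `additive_gap`, `rotary_gap`) are illustrative.

```python
import numpy as np


def rope(x):
    """Rotate consecutive feature pairs of x (shape [n, d], d even) by position-dependent angles."""
    n, d = x.shape
    freqs = 10000.0 ** (-np.arange(0, d, 2) / d)       # one frequency per feature pair
    angles = np.arange(n)[:, None] * freqs[None, :]    # shape [n, d/2]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out


def scores(x, w_q, w_k, rotary):
    """Pre-softmax attention scores, with or without rotary encoding on q and k."""
    q, k = x @ w_q, x @ w_k
    if rotary:
        q, k = rope(q), rope(k)
    return q @ k.T


rng = np.random.default_rng(0)
n, d = 6, 8
x = rng.normal(size=(n, d))      # embeddings; an additive encoding would already be included here
w_q = rng.normal(size=(d, d))
w_k = rng.normal(size=(d, d))

# Hypothetical reparameterization: W_Q -> W_Q A, W_K -> W_K A^{-T}.
a = np.eye(d) + 0.1 * rng.normal(size=(d, d))
w_q2 = w_q @ a
w_k2 = w_k @ np.linalg.inv(a).T

# Without rotary encoding the two parameter sets give the same scores.
additive_gap = np.max(np.abs(scores(x, w_q, w_k, False) - scores(x, w_q2, w_k2, False)))
# With rotary encoding a generic A no longer leaves the scores unchanged.
rotary_gap = np.max(np.abs(scores(x, w_q, w_k, True) - scores(x, w_q2, w_k2, True)))

print(f"additive gap: {additive_gap:.2e}")
print(f"rotary gap:   {rotary_gap:.2e}")
```

In this sketch the additive case is unaffected because the encoding only changes `x`, so the factors `a` and its inverse cancel inside `q @ k.T`. With rotary encoding the position-dependent rotation sits between the two factors, so they cancel only when `a` commutes with the rotations, which a generic matrix does not.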