Theoretical Analysis of Positional Encodings in Transformer Models (arxiv.org)
29 points by PaulHoule 10 hours ago
semiinfinitely 6 hours ago
Kinda disappointing that RoPE, the most common PE, is given about one sentence in this work and omitted from the analysis.
gsf_emergency_2 3 hours ago
Maybe it's because RoPE by itself does nothing for the model's capacity?
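For context on the capacity point: standard RoPE introduces no learned parameters at all; it just rotates each query/key coordinate pair by a fixed, position-dependent angle before the attention dot product. A minimal NumPy sketch of that core operation (function name is mine; the pairing and base=10000 frequency convention follow the usual RoFormer formulation, not necessarily this paper's notation):

    import numpy as np

    def rope(x, base=10000.0):
        # x: (seq_len, dim) queries or keys; dim must be even.
        seq_len, dim = x.shape
        # One inverse frequency per 2-d coordinate pair.
        inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
        # angles[p, i] = position p times frequency i.
        angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, dim/2)
        cos, sin = np.cos(angles), np.sin(angles)
        x1, x2 = x[:, 0::2], x[:, 1::2]
        # Rotate each (x1, x2) pair by its position-dependent angle.
        out = np.empty_like(x)
        out[:, 0::2] = x1 * cos - x2 * sin
        out[:, 1::2] = x1 * sin + x2 * cos
        return out

Since the rotation is fixed and parameter-free, it changes how attention scores depend on relative position without adding anything a capacity analysis would count as extra expressive machinery.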