Rotary Position Embedding, or RoPE, is a type of position embedding that encodes absolute positional information with a rotation matrix and naturally incorporates explicit relative position dependency into the self-attention formulation; a minimal sketch of the rotation appears after the list of properties below. Notably, RoPE comes with valuable properties such as:
- the flexibility to be extended to any sequence length
- decaying inter-token dependency with increasing relative distance
- the capability of equipping linear self-attention with relative position encoding

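A minimal NumPy sketch of how the rotation is applied to per-head query and key vectors, assuming the standard base of 10000 from the original formulation; the helper name `rotary_embed` and the shapes used here are illustrative, not taken from any particular library.

```python
import numpy as np

def rotary_embed(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Rotate each consecutive feature pair of x by a position-dependent angle.

    x: array of shape (seq_len, dim) with dim even, e.g. per-head queries or keys.
    """
    seq_len, dim = x.shape
    # theta_i = base^(-2i/dim) for each feature pair i
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)            # (dim/2,)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)

    x1, x2 = x[:, 0::2], x[:, 1::2]                             # split into pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                          # 2-D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Because the rotation matrices satisfy R_m^T R_n = R_{n-m}, the dot product
# between a rotated query at position m and a rotated key at position n depends
# only on the relative offset n - m, which is how RoPE injects relative
# positional information into the attention scores.
q = rotary_embed(np.random.randn(8, 64))
k = rotary_embed(np.random.randn(8, 64))
scores = q @ k.T  # attention logits carrying relative positional information
```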