Position encoding: Absolute vs Relative vs Rotary Embeddings vs Alibi

The basic option is absolute position encoding, as introduced in the original Transformer paper. Even here there are two variants: SinusoidalInterleaved (the default in OpenNMT-py) and SinusoidalConcat (the default for models imported from Fairseq).

  • position_encoding_type: 'SinusoidalInterleaved'. Do not forget to also set param_init_glorot: true
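To make the difference between the two sinusoidal layouts concrete, here is a minimal pure-Python sketch. It is not the OpenNMT-py or Fairseq code: the function names and the `base` parameter are illustrative assumptions, and both variants deliberately share the same frequencies so that only the layout differs (the real implementations may scale the frequencies slightly differently).

```python
import math


def _angles(position, dim, base=10000.0):
    # One angle per sin/cos pair; `base` follows the Transformer paper's 10000.
    # Hypothetical helper, not a library function.
    return [position / base ** (2 * i / dim) for i in range(dim // 2)]


def sinusoidal_interleaved(position, dim):
    # Layout: [sin0, cos0, sin1, cos1, ...]
    out = []
    for a in _angles(position, dim):
        out.extend([math.sin(a), math.cos(a)])
    return out


def sinusoidal_concat(position, dim):
    # Layout: [sin0, sin1, ..., cos0, cos1, ...]
    angles = _angles(position, dim)
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]
```

For position 0 and dim 4, the interleaved layout gives [0.0, 1.0, 0.0, 1.0] while the concatenated layout gives [0.0, 0.0, 1.0, 1.0]. The values are the same and only their order differs, which is why a checkpoint trained with one layout should be run with the matching setting.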

If you prefer to use relative position encoding, we support 3 modes:

In a nutshell, at the time of this writing (v3.1), absolute position encoding is managed in the Embeddings module, whereas relative position encoding is managed directly in the multi-head self-attention module.
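The split described above can be illustrated with a small sketch. This is a simplified illustration, not the OpenNMT-py implementation: the function names are hypothetical, and the relative variant uses an ALiBi-style linear distance penalty with a single fixed `slope`, whereas ALiBi proper uses a different slope per attention head.

```python
def add_absolute(embeddings, pos_enc):
    # Absolute: position vectors are added to the token embeddings,
    # before any attention layer sees them.
    return [[e + p for e, p in zip(tok, pe)]
            for tok, pe in zip(embeddings, pos_enc)]


def add_relative_bias(scores, slope=0.5):
    # Relative (ALiBi-style): the raw attention score between query i and
    # key j is penalised in proportion to their distance |i - j|.
    # `slope` is an assumed illustrative value.
    return [[s - slope * abs(i - j) for j, s in enumerate(row)]
            for i, row in enumerate(scores)]
```

The first function operates on the model input, which is why it belongs with the embeddings. The second operates on the query-key score matrix, which only exists inside the attention computation.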