Capability
Positional Embedding Strategies With Extrapolation Support
3 artifacts provide this capability.
Top Matches
via “alibi positional encoding for extrapolatable long-context attention”
Fill-mask model. 3,560,259 downloads.
Unique: Combines ALiBi with Flash Attention and RMSNorm in place of standard layer normalization to achieve length extrapolation without learned position embeddings, enabling zero-shot generalization to sequences 4-8x longer than those seen in training (see the bias sketch below).
vs others: Outperforms RoPE (Rotary Position Embeddings) on length-extrapolation benchmarks while incurring lower memory overhead than the interpolated positional embeddings used in LLaMA or GPT-3 variants.
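For context, here is a minimal sketch of the ALiBi mechanism this listing refers to: a fixed, head-specific linear penalty on query–key distance, added to attention logits before the softmax, which is what allows inference at sequence lengths beyond those seen in training. Function names (`alibi_slopes`, `alibi_bias`, `attend`) are illustrative, the slope formula assumes a power-of-two head count (the ALiBi paper gives an interpolation rule for other counts), and this is not the artifact's actual implementation.

```python
import math
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric sequence of head-specific slopes starting at 2**(-8/num_heads),
    # as in the ALiBi paper (simplified: assumes num_heads is a power of two).
    ratio = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([ratio ** (h + 1) for h in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Per-head linear bias of shape (num_heads, seq_len, seq_len):
    # zero on the diagonal, increasingly negative for more distant keys.
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]  # j - i; negative for past keys
    return alibi_slopes(num_heads)[:, None, None] * distance[None, :, :]

def attend(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Causal attention with ALiBi; q, k, v have shape (heads, seq, dim).
    h, s, d = q.shape
    logits = q @ k.transpose(-1, -2) / math.sqrt(d)
    logits = logits + alibi_bias(h, s)  # inject distance penalty pre-softmax
    causal = torch.triu(torch.ones(s, s, dtype=torch.bool), diagonal=1)
    logits = logits.masked_fill(causal, float("-inf"))
    return torch.softmax(logits, dim=-1) @ v

# Usage: because the bias depends only on relative distance, the same
# function runs unchanged at sequence lengths longer than any used in training.
h, s, d = 4, 16, 8
out = attend(torch.randn(h, s, d), torch.randn(h, s, d), torch.randn(h, s, d))
```

The key design point is that the bias is a pure function of query–key distance with no learned parameters, so nothing in the model is tied to a maximum trained position, unlike learned absolute embeddings.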