MrRoPE: Context Extension for LLMs Without Fine-Tuning
A new research paper introduces MrRoPE (Mixed-radix Rotary Position Embedding), a method for extending the context window of large language models (LLMs) without requiring fine-tuning. The technique rests on a generalized formulation that reinterprets the existing RoPE extension strategies as different number-system (radix) conversion schemes applied to position indices.
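To make the "number system" framing concrete, the sketch below shows how two well-known training-free RoPE extensions already amount to changing the radix in which a position index is effectively written: Position Interpolation rescales every dimension pair by the same factor, while NTK-aware scaling changes the base itself. This is a minimal illustration of the shared intuition, not the paper's implementation; the function names and the 4x extension factor are assumptions made for the example.

```python
import numpy as np

def rope_inv_freq(head_dim: int, base: float = 10000.0) -> np.ndarray:
    """Standard RoPE: theta_i = base^(-2i/d). Each dimension pair acts like one
    'digit' of the position, rotating base^(2/d) times slower than the previous one."""
    i = np.arange(0, head_dim, 2)
    return base ** (-i / head_dim)

def pi_inv_freq(head_dim: int, scale: float, base: float = 10000.0) -> np.ndarray:
    """Position Interpolation: divide all frequencies by the extension factor,
    i.e. stretch every 'digit' uniformly."""
    return rope_inv_freq(head_dim, base) / scale

def ntk_inv_freq(head_dim: int, scale: float, base: float = 10000.0) -> np.ndarray:
    """NTK-aware scaling: enlarge the base so slow-rotating dimensions stretch
    more than fast ones, i.e. change the radix itself."""
    new_base = base * scale ** (head_dim / (head_dim - 2))
    return rope_inv_freq(head_dim, new_base)

if __name__ == "__main__":
    d, s = 8, 4.0  # tiny head dimension and 4x context extension, purely illustrative
    print("RoPE :", rope_inv_freq(d))
    print("PI   :", pi_inv_freq(d, s))
    print("NTK  :", ntk_inv_freq(d, s))
```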
Unified Approach and New Extensions
MrRoPE provides a unified theoretical framework for extending Rotary Position Embedding (RoPE), addressing the fragmentation among current strategies. From this framework the paper derives two new training-free extensions, MrRoPE-Uni and MrRoPE-Pro, based on uniform and progressive conversion strategies respectively. MrRoPE-Pro in particular demonstrates strong "train short, test long" generalization.
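As a rough, hypothetical illustration of the uniform-versus-progressive distinction (not the published MrRoPE-Uni/Pro formulas, which the article does not reproduce), one can contrast a single scaling factor applied to every dimension pair with a per-dimension factor that grows from the high-frequency end to the low-frequency end. The linear ramp used here is an assumption made purely for the example.

```python
import numpy as np

def uniform_scale(inv_freq: np.ndarray, scale: float) -> np.ndarray:
    # Uniform conversion: the same factor for every dimension pair.
    return inv_freq / scale

def progressive_scale(inv_freq: np.ndarray, scale: float) -> np.ndarray:
    # Progressive conversion (illustrative): the factor ramps from 1 on the
    # fast-rotating dimensions (preserving local order) up to `scale` on the
    # slow-rotating dimensions, which absorb most of the extension.
    ramp = np.linspace(1.0, scale, num=len(inv_freq))
    return inv_freq / ramp

if __name__ == "__main__":
    inv_freq = 10000.0 ** (-np.arange(0, 8, 2) / 8)
    print("uniform    :", uniform_scale(inv_freq, 4.0))
    print("progressive:", progressive_scale(inv_freq, 4.0))
```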
Performance and Benefits
In evaluations, MrRoPE-Pro maintained over 85% recall on the Needle-in-a-Haystack test at a 128K-token context length. It also achieved more than double YaRN's accuracy on the retrieval and dialogue subsets of Infinite-Bench. Theoretical analysis confirms that MrRoPE-Pro effectively raises the upper bound of RoPE's attainable encoding length.
For teams evaluating on-premise deployments, such context extensions bring trade-offs that deserve careful consideration. AI-RADAR offers analytical frameworks at /llm-onpremise to evaluate these aspects.