OLMo 3.5, the next model in the OLMo series from AI2 (the Allen Institute for AI), is coming soon. Its headline feature is a hybrid architecture that combines standard transformer attention layers with linear attention layers based on Gated DeltaNet.
Hybrid Architecture for Efficiency
The main goal of this hybrid design is to improve computational efficiency and reduce the memory footprint during inference while maintaining model quality. It does so by interleaving full attention layers with linear attention layers: the linear layers replace softmax attention's ever-growing key-value cache with a fixed-size recurrent state, so only the interleaved full attention layers pay the quadratic cost.
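As a rough illustration, here is a minimal PyTorch sketch of the interleaving idea. The 3:1 linear-to-full layer ratio, the ReLU feature map, the omitted causal mask, and all class names are illustrative assumptions for this sketch, not OLMo 3.5's actual configuration.

```python
# Minimal sketch of a hybrid layer stack: hypothetical 3:1 ratio of
# linear-attention to full-attention layers. Simplified stand-in blocks,
# not OLMo 3.5's actual modules.
import torch
import torch.nn as nn

class FullAttentionBlock(nn.Module):
    """Softmax attention: O(T^2) compute, KV cache grows with sequence length."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x, need_weights=False)  # causal mask omitted for brevity
        return self.norm(x + out)

class LinearAttentionBlock(nn.Module):
    """Kernelized linear attention: O(T) compute, fixed-size state.
    Gated DeltaNet refines this idea with a gated delta-rule state update."""
    def __init__(self, d_model: int):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(d_model, d_model) for _ in range(3))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Positive feature map so the running sums stay well-behaved.
        q = torch.relu(self.q(x)) + 1e-6
        k = torch.relu(self.k(x)) + 1e-6
        v = self.v(x)
        kv = torch.cumsum(k.unsqueeze(-1) * v.unsqueeze(-2), dim=1)  # running sum of k v^T
        z = torch.cumsum(k, dim=1)                                   # running normalizer
        num = torch.einsum("btd,btde->bte", q, kv)
        den = torch.einsum("btd,btd->bt", q, z).unsqueeze(-1) + 1e-6
        return self.norm(x + num / den)

class HybridStack(nn.Module):
    """Every `full_attn_every`-th layer uses full attention, the rest linear."""
    def __init__(self, num_layers: int, d_model: int, full_attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            FullAttentionBlock(d_model) if (i + 1) % full_attn_every == 0
            else LinearAttentionBlock(d_model)
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(1, 16, 64)                              # (batch, tokens, d_model)
print(HybridStack(num_layers=8, d_model=64)(x).shape)   # torch.Size([1, 16, 64])
```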
Open Source and Innovative Techniques
The OLMo series stands out for being completely open source, from the training datasets to the training recipes themselves. With OLMo 3.5, the team is experimenting with innovative techniques, including some introduced by Qwen3-Next, to further optimize memory usage, especially on long-context tasks.
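To see why a fixed-size state matters at long context lengths, here is a back-of-the-envelope comparison of per-layer decoding memory. The dimensions (d_model 4096, 32 heads of dimension 128, bf16) are hypothetical, chosen only for illustration, not OLMo 3.5 measurements.

```python
# Illustrative per-layer decoding memory: growing KV cache (full attention)
# vs a constant-size state matrix (linear attention). All dimensions are
# assumed for this sketch.
n_heads, head_dim = 32, 128
bytes_per_val = 2  # bf16

def kv_cache_bytes(context_len: int) -> int:
    # Full attention stores K and V for every past token.
    return 2 * context_len * n_heads * head_dim * bytes_per_val

def linear_state_bytes() -> int:
    # Linear attention keeps one d_k x d_v state matrix per head,
    # independent of context length.
    return n_heads * head_dim * head_dim * bytes_per_val

for ctx in (4_096, 65_536, 1_048_576):
    print(f"ctx={ctx:>9,}: full attn {kv_cache_bytes(ctx)/2**20:9.1f} MiB "
          f"vs linear {linear_state_bytes()/2**20:.1f} MiB per layer")
```

At a 1M-token context, the per-layer KV cache under these assumptions reaches about 16 GiB, while the linear-attention state stays at 1 MiB no matter how long the context grows.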
The OLMo series consists of dense models; the smallest has 1 billion parameters.