Introduction

Gamayun is a new multilingual large language model (LLM) that has recently attracted attention for surpassing competitors through an innovative pre-training strategy.

Technical Characteristics

The Gamayun model was trained on a total of 2.5 trillion tokens and supports 12 languages, with a particular emphasis on Russian.

Achieved Results

Despite a smaller training budget than its competitors, Gamayun achieves strong results on all of the benchmarks considered, surpassing the Qwen2.5-1.5B model across a wide range of English and multilingual tasks.

Implications

The pre-training strategy employed by Gamayun opens new possibilities for adapting LLMs in resource-constrained settings, making them accessible to a wider audience.