AI generated
DeepSeek proposes a workaround to train bigger AI models with less powerful chips
Chinese AI startup DeepSeek has unveiled a new approach to building larger, more capable AI models without needing the most advanced computer chips that US export controls have restricted.
## Engram: A Solution to Optimize Memory Usage
The technique, detailed in a technical paper by DeepSeek founder Liang Wenfeng and researchers from Peking University, tackles a fundamental problem: AI models are getting so large that they're bumping up against the memory limits of even the best GPUs. DeepSeek's solution, called "Engram," creates a more efficient filing system that lets the AI store basic facts separately from complex calculations, freeing up precious computing power for the harder thinking tasks.
## Why Chip Memory Matters
Modern AI models need to access vast amounts of information quickly during training and when responding to queries. That requires high-bandwidth memory (HBM), a specialized, fast-access memory built into advanced GPUs. China is at a significant disadvantage here: its leading memory chip manufacturer, ChangXin Memory Technologies, remains several years behind industry leaders like Samsung, SK Hynix, and Micron.
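To give a rough sense of scale, the back-of-the-envelope sketch below estimates how much memory a model of the size tested in the paper (27 billion parameters) would occupy. The byte counts and the 80 GB device capacity are assumptions chosen for illustration, not figures from DeepSeek: 2 bytes per parameter corresponds to 16-bit weights, and 16 bytes per parameter is a common rule of thumb for mixed-precision training with an Adam-style optimizer.

```python
import math


def memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the memory footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 27_000_000_000   # model size mentioned in the article
BYTES_WEIGHTS_16BIT = 2   # assumption: weights stored in a 16-bit format
BYTES_TRAINING = 16       # assumption: weights + gradients + optimizer state
DEVICE_GB = 80            # assumption: hypothetical accelerator with 80 GB of HBM

weights_gb = memory_gb(PARAMS, BYTES_WEIGHTS_16BIT)
training_gb = memory_gb(PARAMS, BYTES_TRAINING)

# Minimum number of devices needed just to hold the training state,
# ignoring activations and any other working memory.
min_devices = math.ceil(training_gb / DEVICE_GB)

print(f"Weights only:   {weights_gb:.0f} GB")
print(f"Training state: {training_gb:.0f} GB")
print(f"Devices needed: {min_devices}")
```

Under these assumptions the weights alone nearly fill one such device, and the training state has to be spread across several, which is why memory capacity and bandwidth, rather than raw compute alone, can become the limiting factor.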
## How the Breakthrough Works
Traditional AI models handle everything through computation, even retrieving simple information. Engram changes this by letting models "look up" foundational facts more efficiently, similar to how humans might consult a reference book. The technique also helps AI handle longer inputs, which remains a major obstacle for deploying AI chatbots as practical assistants.
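The "reference book" idea can be illustrated with a toy lookup table. The sketch below is not DeepSeek's implementation, and the article does not describe Engram's internals. It simply assumes that short token sequences (n-grams) are hashed to a slot in a fixed-size table of stored vectors, so that retrieving a stored fact costs one hash and one memory read instead of a pass through the network's layers. All names and sizes here (`TABLE_SLOTS`, `EMBED_DIM`, `slot_for`, `store`, `lookup`) are hypothetical.

```python
import hashlib

TABLE_SLOTS = 1024  # hypothetical table size
EMBED_DIM = 4       # hypothetical vector width

# The "reference book": a fixed table of vectors, initially all zeros.
memory_table = [[0.0] * EMBED_DIM for _ in range(TABLE_SLOTS)]


def slot_for(ngram: tuple) -> int:
    """Map an n-gram of tokens to a table slot with a deterministic hash."""
    digest = hashlib.sha256(" ".join(ngram).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SLOTS


def store(ngram: tuple, vector: list) -> None:
    """Write a vector into the slot for this n-gram (collisions overwrite)."""
    memory_table[slot_for(ngram)] = list(vector)


def lookup(tokens: list, n: int = 2) -> list:
    """Return one vector per position, read from the n-gram ending there.

    Positions without a full n-gram of context get a zero vector.
    """
    vectors = []
    for i in range(len(tokens)):
        if i + 1 < n:
            vectors.append([0.0] * EMBED_DIM)
        else:
            ngram = tuple(tokens[i + 1 - n : i + 1])
            vectors.append(memory_table[slot_for(ngram)])
    return vectors


# Usage: store one "fact" and retrieve it without any model computation.
store(("eiffel", "tower"), [1.0, 0.0, 0.0, 0.0])
retrieved = lookup(["the", "eiffel", "tower"])
print(retrieved[2])
```

In a real system the retrieved vectors would be combined with the model's own hidden states. The point of the sketch is only that the lookup itself is a cheap memory read, and a table like this need not live in the scarcest, fastest memory.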
## Results and Prospects
Testing on a 27-billion-parameter model showed performance improvements and freed up more computing capacity for complex tasks. Elie Bakouch, a research engineer at Hugging Face, praised the technique for its practical implementation. DeepSeek is expected to release a V4 model with enhanced coding capabilities, coinciding with the first anniversary of its R1 model. The technical paper is likely to draw close scrutiny from AI researchers, as DeepSeek has emerged as a prominent example of Chinese AI innovation operating under US export restrictions on advanced semiconductors.