Z.ai has released GLM-4.7-Flash, a 30-billion-parameter Mixture of Experts (MoE) reasoning model designed specifically for local inference.

## Key Features

* **Performance:** Optimized for coding, agentic workflows, and chat.
* **Efficiency:** Activates only approximately 3.6 billion parameters per forward pass.
* **Extended context:** Supports context windows of up to 200,000 tokens.
* **Benchmarks:** Strong results on SWE-Bench and GPQA, as well as on reasoning and chat evaluations.

The official guide for using and fine-tuning GLM-4.7-Flash is available on Unsloth.ai.
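Since the model targets local inference, a quick back-of-the-envelope check of weight memory at common quantization levels is useful before downloading. The sketch below is a rough estimate (weights only, ignoring KV cache and runtime overhead) using the 30-billion-parameter total from the announcement; the helper function name is our own, not part of any official tooling.

```python
def approx_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Rough weight-memory estimate in GiB (weights only, no KV cache)."""
    return n_params * bits_per_param / 8 / 2**30

TOTAL_PARAMS = 30e9  # 30B total parameters, per the announcement

for bits in (16, 8, 4):
    # e.g. 16-bit (fp16/bf16), 8-bit, and 4-bit quantized weights
    print(f"{bits:>2}-bit: ~{approx_weight_gib(TOTAL_PARAMS, bits):.1f} GiB")
# 16-bit: ~55.9 GiB
#  8-bit: ~27.9 GiB
#  4-bit: ~14.0 GiB
```

Note that only about 3.6B parameters are active per token, which cuts compute per token, but all 30B weights still need to fit in memory (RAM plus VRAM), so a 4-bit quantization is what makes the model practical on consumer hardware.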