Zhipu AI has announced the availability of GLM-5.1, its latest flagship model, accessible through its paid "Coding Plan" subscriptions.

Performance and Capabilities

According to the published benchmark figures, GLM-5.1 achieves high performance on coding benchmarks:

  • SWE-bench-Verified: 77.8 points, the highest score among open-source models.
  • Terminal Bench 2.0: 56.2 points, another SOTA (state-of-the-art) result for open-source models.
  • Coding performance competitive with GPT-4o and approaching Claude Opus 4.5.
  • Context window of 200,000 tokens, with a maximum output capacity of 128,000 tokens.
  • 744 billion total parameters (40 billion activated per token) and pre-training on 28.5T tokens of data.
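The 200,000-token context window and 128,000-token output cap translate directly into request parameters when calling the model. A minimal sketch of building a chat-completion payload, assuming an OpenAI-compatible request shape; the model identifier "glm-5.1" is a hypothetical placeholder, not confirmed by the announcement:

```python
# Sketch: build a chat-completion payload for GLM-5.1 (assumed
# OpenAI-compatible shape; the model id "glm-5.1" is hypothetical).

MAX_OUTPUT_TOKENS = 128_000  # stated maximum output capacity


def build_request(prompt: str, max_tokens: int = 4_096) -> dict:
    """Return a request payload, clamping max_tokens to the model's cap."""
    return {
        "model": "glm-5.1",  # hypothetical identifier
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": min(max_tokens, MAX_OUTPUT_TOKENS),
    }


payload = build_request("Refactor this function...", max_tokens=200_000)
print(payload["max_tokens"])  # clamped to 128000
```

Requests asking for more output than the cap should be clamped client-side, as above, rather than rejected by the server mid-generation.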

Practical Applications

GLM-5.1 is designed to tackle complex tasks such as:

  • Autonomous multi-step coding with minimal manual intervention.
  • Refactoring and debugging of large codebases.
  • Agentic workflows: plan, execute, debug, and deliver.
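The plan-execute-debug-deliver loop above can be sketched as a simple driver. Everything here is illustrative: `call_model` is a stub standing in for a real GLM-5.1 API call, and the loop structure is an assumption about how such an agent might be wired, not an actual Zhipu SDK:

```python
# Illustrative agentic loop: plan, execute, debug, deliver.
# call_model is a stub; a real implementation would hit the model endpoint.

def call_model(prompt: str) -> str:
    # Stub: echoes the prompt instead of querying a real model.
    return f"response to: {prompt}"


def run_agent(task: str, max_iterations: int = 3) -> list[str]:
    """Drive a plan -> execute -> debug loop, returning the transcript."""
    transcript = [call_model(f"Plan steps for: {task}")]
    for i in range(max_iterations):
        result = call_model(f"Execute step {i + 1}")
        transcript.append(result)
        if "error" not in result:  # crude check; real agents parse tool output
            break  # deliver: step succeeded, stop iterating
        transcript.append(call_model(f"Debug failure: {result}"))
    return transcript


steps = run_agent("add unit tests to module X")
```

A production agent would replace the string check with structured tool-call results and feed the full transcript back into each model call.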

For those evaluating on-premise deployments, there are trade-offs to consider. AI-RADAR offers analytical frameworks on /llm-onpremise to support these evaluations.