Z.ai has released its latest artificial intelligence model, GLM-5.1, under an MIT open-source license, allowing enterprises to download, modify, and use it for commercial applications.
The release marks a shift in focus from short, fast responses to long-duration autonomous task execution, with the model designed to operate on a single task for up to eight hours.
GLM-5.1 is a 754-billion-parameter Mixture-of-Experts model built to maintain goal alignment across extended workflows involving thousands of tool calls. The model can reportedly complete up to 1,700 steps in a task, compared to around 20 steps for earlier agent-based systems.
It has a 202,752-token context window and is designed to avoid performance plateaus through a “staircase” optimization pattern, in which runs of incremental improvement are punctuated by structural changes in strategy.
In a VectorDBBench test, the model executed 655 iterations and over 6,000 tool calls to optimize a vector database system.
Throughput rose from the 3,547 queries per second recorded for earlier models to 21,500 queries per second, roughly a sixfold gain, after the model identified multiple structural bottlenecks and applied changes such as cluster probing and pipeline optimization.
On KernelBench Level 3, GLM-5.1 achieved a 3.6x geometric mean speedup across 50 machine learning tasks, maintaining progress over extended runs.
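For readers unfamiliar with the metric, a geometric mean speedup multiplies the per-task speedups together and takes the n-th root, so a single outlier task moves it less than it would move an arithmetic average. The Python sketch below illustrates the calculation only; the per-task values are hypothetical placeholders, not KernelBench data, and `geometric_mean` is an illustrative helper rather than part of any benchmark harness.

```python
import math


def geometric_mean(speedups):
    """Return the geometric mean of a list of positive speedup ratios."""
    if not speedups:
        raise ValueError("need at least one speedup")
    if any(s <= 0 for s in speedups):
        raise ValueError("speedups must be positive")
    # Average the logarithms, then exponentiate. This equals the n-th root
    # of the product but stays numerically stable for long lists.
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


# Hypothetical per-task speedups (NOT benchmark results).
example = [2.0, 8.0, 4.0]
print(round(geometric_mean(example), 3))   # 4.0
print(round(sum(example) / len(example), 3))  # arithmetic mean, 4.667
```

In this toy example the 8x task pulls the arithmetic mean up to about 4.67, while the geometric mean stays at 4.0.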
The model also recorded strong benchmark scores, including 58.4 on SWE-Bench Pro, outperforming models such as GPT-5.4 and Claude Opus 4.6.
Additional results include 95.3 on the AIME 2026 math benchmark and 86.2 on GPQA-Diamond for scientific reasoning.
Z.ai has positioned GLM-5.1 as a tool for developers rather than a consumer chatbot. It is integrated into a subscription-based Coding Plan with three tiers.
The Lite plan costs $27 per quarter, the Pro plan is priced at $81 per quarter, and the Max plan costs $216 per quarter.
For API usage, pricing is set at $1.40 per million input tokens and $4.40 per million output tokens, with discounted rates available for cached inputs.
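To put the per-token rates in concrete terms, the sketch below estimates the bill for a single job at the list prices quoted above. It is a rough illustration: the token counts are made up, `estimate_cost_usd` is a hypothetical helper, and the cached-input discount is left out because its rate is not stated here.

```python
from decimal import Decimal

# List prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_MTOK = Decimal("1.40")
OUTPUT_USD_PER_MTOK = Decimal("4.40")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate API cost in USD, ignoring any cached-input discount."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_USD_PER_MTOK
             + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK)
    return total / MILLION


# Illustrative job: 2M input tokens and 500K output tokens.
cost = estimate_cost_usd(2_000_000, 500_000)
print(f"${cost:.2f}")  # $5.00
```

Long agentic runs tend to resend large contexts, so input tokens can dominate the bill even though the output rate is higher.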
A separate model, GLM-5 Turbo, remains proprietary and is optimized for faster execution.

| Model | Input ($ per 1M tokens) | Output ($ per 1M tokens) | Total Cost ($, input + output) |
|---|---|---|---|
| Grok 4.1 Fast | $0.20 | $0.50 | $0.70 |
| MiniMax M2.7 | $0.30 | $1.20 | $1.50 |
| Gemini 3 Flash | $0.50 | $3.00 | $3.50 |
| Kimi-K2.5 | $0.60 | $3.00 | $3.60 |
| MiMo-V2-Pro (≤256K) | $1.00 | $3.00 | $4.00 |
| GLM-5 | $1.00 | $3.20 | $4.20 |
| GLM-5-Turbo | $1.20 | $4.00 | $5.20 |
| GLM-5.1 | $1.40 | $4.40 | $5.80 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen3-Max | $1.20 | $6.00 | $7.20 |
| Gemini 3 Pro | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| GPT-5.4 | $2.50 | $15.00 | $17.50 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 |
| GPT-5.4 Pro | $30.00 | $180.00 | $210.00 |
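
The Total Cost column implicitly assumes one million input tokens and one million output tokens, which few real workloads match. As a rough sketch, the Python snippet below recomputes relative cost against GLM-5.1 for an arbitrary token mix, using three rows copied from the table; `workload_cost` and `cost_ratio` are hypothetical helper names, and the default 1:1 mix is simply the table's own assumption.

```python
# (input, output) prices in USD per million tokens, copied from the table.
PRICES = {
    "GLM-5.1": (1.40, 4.40),
    "GPT-5.4": (2.50, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
}


def workload_cost(model, input_mtok, output_mtok):
    """Cost in USD for a workload measured in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_price * input_mtok + output_price * output_mtok


def cost_ratio(model, baseline="GLM-5.1", input_mtok=1.0, output_mtok=1.0):
    """How many times more expensive `model` is than `baseline`."""
    return (workload_cost(model, input_mtok, output_mtok)
            / workload_cost(baseline, input_mtok, output_mtok))


for name in PRICES:
    print(f"{name}: {cost_ratio(name):.2f}x")
```

At the table's 1:1 mix, GPT-5.4 comes out at about 3.0 times and Claude Opus 4.6 at about 5.2 times the GLM-5.1 price. Passing a different `input_mtok` and `output_mtok` shows how the gap shifts for input-heavy or output-heavy jobs.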
The release follows a hybrid strategy in which GLM-5.1 is open source while speed-optimized models such as GLM-5 Turbo remain closed.
Z.ai, which was listed on the Hong Kong Stock Exchange earlier this year with a market value of $52.83 billion, is using the release to expand its position in the global AI market.
The approach aligns with broader trends in the Chinese AI sector, where companies are separating open models from commercial offerings.
Early user feedback highlights improvements in reliability and task execution. Some developers reported completing projects in significantly less time, including reducing a week-long workflow to two days.
The model has also demonstrated the ability to build complex applications autonomously, including a Linux-style desktop environment within eight hours.
Z.ai notes that challenges remain, including maintaining coherence over extended workflows and improving self-evaluation mechanisms for tasks without clear performance metrics.
The model supports integration with developer tools such as Claude Code, OpenCode, Kilo Code, and others, enabling broader adoption in software development workflows.