GLM-4.7: Built for real development workflows
Z.ai has released GLM-4.7, a new update to its GLM family aimed squarely at production-grade engineering. The model is tuned for multi-step tasks, frequent tool calls, and long-running sessions where consistency matters. It ranks #6 on the WebDev leaderboard and is the top-ranked open model there, reinforcing the company's reputation as what some call "China's OpenAI."
If you build with agents, CI-integrated copilots, or terminal-driven automation, this release is worth a look. The focus is clear: fewer retries, fewer prompt tweaks, and a more predictable loop from plan to execution.
What changed vs. GLM-4.6
GLM-4.7 moves past GLM-4.6 with better support for coding workflows, deeper reasoning for complex tasks, and steadier tool interaction. The model keeps a stable thread across long task chains, which helps cut down on drift as your sessions grow.
It also improves conversational quality and writing flow, which shows up in code reviews, documentation, and role-based interactions. In short: better output, less babysitting.
Why this matters for engineers
- Handles longer task cycles with fewer resets or prompt edits.
- Plays well with tool-heavy agent stacks and terminal workflows.
- Supports "think-then-act" patterns in frameworks like Claude Code, Cline, Roo Code, TRAE, and Kilo Code.
- More consistent reasoning across multiple interactions.
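A "think-then-act" loop of the kind those frameworks run can be sketched as follows. This is a minimal illustration, not any framework's real API: the model call and tool invocations are stubbed, and the names (`think`, `act`, `Step`) are ours.

```python
# Minimal sketch of a think-then-act agent loop. The model call is stubbed;
# in a real setup it would go to GLM-4.7 through your agent framework.
from dataclasses import dataclass, field

@dataclass
class Step:
    thought: str   # reasoning produced before acting
    action: str    # tool name invoked
    result: str    # tool output fed back into context

@dataclass
class Agent:
    history: list = field(default_factory=list)

    def think(self, task: str) -> str:
        # Placeholder for a GLM-4.7 reasoning call.
        return f"plan: break '{task}' into tool calls"

    def act(self, action: str) -> str:
        # Placeholder for a tool invocation (shell, editor, browser...).
        return f"ok:{action}"

    def run(self, task: str, actions: list) -> list:
        thought = self.think(task)       # think once...
        for action in actions:           # ...then act, logging each step
            result = self.act(action)
            self.history.append(Step(thought, action, result))
        return self.history

steps = Agent().run("fix failing test", ["read_file", "edit_file", "run_tests"])
```

Logging every `Step` is what makes drift measurable later: you can diff the thought/action trace across sessions instead of eyeballing chat transcripts.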
Benchmarks and tool use
On BrowseComp, GLM-4.7 scores 67.5. On τ²-Bench, which measures interactive tool use, it posts 87.4, the highest result reported so far among publicly available open-source models.
Across SWE-bench Verified, LiveCodeBench v6, and Terminal Bench 2.0, performance lands at or above the level of Claude Sonnet 4.5, with clear gains over GLM-4.6. On Code Arena's large-scale blind evaluations, GLM-4.7 holds the top spot among open-source models and leads models built in China.
For context on one of the core evaluations, see the SWE-bench Verified benchmark overview on Papers With Code. If you deploy on modern edge or full-stack platforms, note that Vercel is one of several supported integrations.
Reasoning you can steer
GLM-4.7 gives finer control over the depth and shape of its reasoning. It can keep its chain of thought consistent across multiple turns while scaling up or down based on task complexity.
This makes agent behavior more predictable over time, especially when tasks branch or when tool feedback loops are noisy. If you measure stability in production, this is the kind of knob you want.
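As an illustration of that knob, a client could scale a reasoning-effort setting with task complexity. The parameter names below (`thinking`, `effort`) and the complexity heuristic are assumptions made for this sketch, not the documented API surface; check the GLM-4.7 docs for the actual controls.

```python
# Sketch: pick a reasoning-effort setting from a crude complexity proxy.
# Field names ("thinking", "effort") are illustrative, not the real API.

def reasoning_config(task: str) -> dict:
    # Proxy: prompts with several bulleted sub-steps get deeper reasoning.
    steps = task.count("\n- ") + 1
    effort = "high" if steps >= 3 else "low"
    return {"model": "glm-4.7", "thinking": {"effort": effort}}

simple = reasoning_config("rename a variable")
multi = reasoning_config(
    "refactor auth:\n- extract module\n- add tests\n- migrate callers"
)
```

The point is not the heuristic itself but keeping the decision out of the prompt: a config-level knob is easier to log, A/B test, and roll back than per-prompt instructions to "think harder."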
Front-end generation gets cleaner
For frontend tasks, GLM-4.7 shows a stronger grasp of layout patterns: spacing, hierarchy, and style cohesion are more consistent. Web pages, dashboards, and slides need fewer manual fixes.
This won't replace design, but it will shorten the feedback loop and reduce nitpicks that stall delivery.
Real-world evaluation
Z.ai ran 100 real programming tasks inside a Claude Code-based setup, spanning frontend, backend, and instruction-following. The company reports higher task completion and steadier behavior vs. GLM-4.6.
Based on these results, GLM-4.7 is now the default model for the GLM Coding Plan. The headline: fewer retries, faster handoffs, and simpler rollout.
Ecosystem and availability
GLM-4.7 ships via the BigModel.cn API and sits inside Z.ai's full-stack development environment. Adoption spans developer tools, infrastructure, and app platforms, including TRAE, Cerebras, YouWare, Vercel, OpenRouter, and CodeBuddy.
- Default model for the GLM Coding Plan: z.ai/subscribe
- Try the model: chat.z.ai
- Model weights: Hugging Face
- Technical post: z.ai/blog/glm-4.7
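Several of the listed providers (OpenRouter among them) expose OpenAI-compatible chat endpoints, so a request payload would look roughly like the sketch below. The model slug `z-ai/glm-4.7` is an assumption for illustration; check your provider's model list for the exact identifier.

```python
# Build an OpenAI-compatible chat request payload for GLM-4.7.
# The slug "z-ai/glm-4.7" is assumed; verify it against your provider.
import json

def build_request(prompt: str) -> dict:
    return {
        "model": "z-ai/glm-4.7",   # assumed slug, verify per provider
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,        # low temperature suits coding tasks
    }

payload = json.dumps(build_request("Write a unit test for parse_config()"))
```

Because the shape is OpenAI-compatible, swapping GLM-4.7 into an existing agent stack is usually a base-URL and model-slug change rather than a rewrite.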
IPO plans and growth
Z.ai plans to list on the Stock Exchange of Hong Kong, aiming to be the first publicly listed company with a core business centered on AGI foundation models. Reported revenues: 57.4M RMB in 2022, 124.5M RMB in 2023, and 312.4M RMB in 2024, a CAGR of over 130% across that period.
H1 2025 revenue reached 190M RMB, marking the third straight year of doubling year-over-year. The company says large-model products drove the growth.
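The growth rate is easy to sanity-check from the reported series (values in millions of RMB):

```python
# CAGR from 2022 to 2024 over the reported revenue figures.
rev_2022, rev_2024 = 57.4, 312.4
years = 2
cagr = (rev_2024 / rev_2022) ** (1 / years) - 1  # just over 1.3, i.e. >130%
```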
Quick start checklist
- Run GLM-4.7 inside your preferred agent IDE (Claude Code, Cline, Roo Code, TRAE, or Kilo Code).
- Enable think-then-act patterns and log tool calls for traceability.
- Test against a fixed set of long, multi-step tasks to measure drift and retries.
- Compare against your current model on rate limits, tool latency, and handoff reliability.
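The comparison step above can be sketched as a small harness: replay a fixed task set against two models and count retries. `run_task` is a stand-in for your real agent runner; the retry behavior here is stubbed so the sketch is self-contained.

```python
# Replay a fixed task set against two models and total the retries.
# run_task is a stub: pretend the baseline retries once on long tasks.
def run_task(model: str, task: str) -> int:
    long_task = len(task) > 20
    return 1 if (model == "baseline" and long_task) else 0

TASKS = ["fix flaky test", "migrate build system to bazel"]

def total_retries(model: str) -> int:
    return sum(run_task(model, task) for task in TASKS)

baseline_retries = total_retries("baseline")
candidate_retries = total_retries("glm-4.7")
```

Keeping the task set fixed is the important part: retry counts only mean something when both models face identical long, multi-step work.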
About Z.ai
Founded in 2019 as a spin-out from Tsinghua University research, Z.ai built the GLM (General Language Model) pre-training architecture and a full-stack portfolio spanning language, code, multimodality, and agents. Its models run on more than 40 domestically produced chips, and the company states its roadmap tracks with global top-tier standards.
Further learning
- Hands-on AI Certification for Coding: structured projects to integrate LLMs into dev workflows.
- AI courses by leading providers: compare options based on your stack and role.