Z.ai’s GLM-5.2 tops AI rankings on Huawei chips

China’s Z.ai has put a fresh dent in the idea that export controls can simply freeze the country out of frontier AI. Its new GLM-5.2 model shot to the top of Artificial Analysis rankings, was trained entirely on Huawei Ascend 910B chips rather than Nvidia hardware, and is being released with MIT-licensed weights that anyone can download and run locally.

That combination is awkward for Washington and handy for everyone else: the model looks competitive on several public benchmarks, but its strongest showing is in the kind of practical coding tasks that developers actually pay for. The catch is equally important. Fast, local access is one thing; trust in the cloud API is another, especially for a Chinese company operating under Chinese law.

GLM-5.2’s benchmark results

On Code Arena, GLM-5.2 reached second place overall with a score of 1595 and first place among available models. On SWE-bench Pro, it scored 62.1, ahead of OpenAI’s GPT-5.5 at 58.6. It also took first place on Design Arena. The one obvious blemish: on SWE-Marathon, a harsher long-horizon coding test, it managed 13.0 versus 26.0 for Claude Opus 4.8.

Code Arena: 1595, second overall and first among available models
SWE-bench Pro: 62.1, ahead of GPT-5.5 at 58.6
Design Arena: first place
SWE-Marathon: 13.0, below Claude Opus 4.8 at 26.0

How Huawei Ascend 910B changed the training math

Z.ai says the whole GLM-5 family was trained only on Huawei Ascend 910B processors, with Nvidia excluded by design. That matters because it shows Chinese labs are no longer dependent on American accelerators to produce high-end models, even if the trade-off is slower inference: GLM-5.2 reportedly generates about 17 to 19 tokens per second, compared with 25 to 30 or more on Nvidia-based systems.

The model uses a mixture-of-experts architecture with 744 billion parameters, but only about 40 billion are active for any given token. A routing system picks 8 of 256 expert subnetworks for each token, while DeepSeek Sparse Attention helps the model handle a 1 million token context window without the quadratic compute bill that would normally make that kind of scale impractical. In plain English: it is built to chew through giant codebases without melting the budget every time you ask it a question.
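For readers who want to see what "8 of 256" routing means in practice, here is a minimal sketch of generic top-k expert routing. It is not Z.ai's code: the function name `route_token`, the random router scores, and the softmax over the selected experts are illustrative assumptions, and only the 256, 8, 744 billion and 40 billion figures come from the article.

```python
import math
import random

NUM_EXPERTS = 256   # from the article
TOP_K = 8           # from the article


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and normalise their weights.

    Hypothetical helper: real routers are learned layers; here the scores
    are simply passed in as a list of floats, one per expert.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i],
                    reverse=True)
    chosen = ranked[:top_k]
    # Softmax over the chosen experts only (an assumed, common choice).
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Stand-in router scores for a single token (random, for illustration only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

# Share of the model that does work for each token, per the article's figures.
active_fraction = 40e9 / 744e9

print(f"experts used for this token: {len(selected)} of {NUM_EXPERTS}")
print(f"active parameter share: {active_fraction:.1%}")
```

The point of the design is that only the chosen experts run for a given token, so compute per token tracks the roughly 40 billion active parameters rather than the full 744 billion.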

The price of independence

The training run reportedly took about 15% more compute time than an equivalent Nvidia-based job and cost around $25 million, helped by cheaper Ascend chips and Chinese state subsidies. That is still far less than the bill for comparable frontier-model training in the US, which is exactly why hardware restrictions have had mixed results: they can slow a lab down, but they do not magically stop it from shipping.

There is also a split between open weights and usable deployment. Self-hosting GLM-5.2 gives teams control over their data, but it requires roughly 1.5 terabytes of memory, so this is not a casual weekend experiment. If you use Z.ai’s cloud API instead, you get convenience with a legal asterisk: the company is based in Beijing and operates under Chinese national intelligence, data security, and cybersecurity laws that impose broad obligations to give the state access to data.
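The 1.5 terabyte figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the weights are stored at 16 bits (2 bytes) per parameter, which the article does not state, and it ignores activation memory and attention caches, so treat it as a rough lower bound rather than a deployment spec.

```python
TOTAL_PARAMS = 744e9    # from the article
BYTES_PER_PARAM = 2     # ASSUMPTION: 16-bit weights; not stated in the article

# Memory needed just to hold the weights, in decimal terabytes.
weights_tb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e12

print(f"weights alone: about {weights_tb:.2f} TB")
```

Under that assumption the weights alone come to just under 1.5 TB, which is in line with the reported requirement. Every expert has to be resident in memory even though only a few run per token, which is why sparse activation cuts compute but not the memory footprint.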

Why export controls are getting awkward

The bigger story is not just one model’s scorecard. Stanford’s 2026 AI Index says the performance gap between leading US and Chinese models has narrowed to 2.7 percentage points, while Epoch AI estimates China still trails by about seven months on the hardest reasoning tests. That is a gap, yes, but it is no longer the moat some policymakers seem to imagine.

Z.ai was placed on a US trade blacklist, the Commerce Department's Entity List, in January 2025, and US lawmakers began probing the security risks of Chinese models in critical infrastructure in May 2026. Yet a model trained on 100,000 Huawei Ascend 910B chips and released with free MIT-licensed weights is a pretty direct rebuttal to the theory that access restrictions alone can keep Chinese AI in the slow lane. The next question is whether GLM-5.2 can stay close to Western closed models in real products, not just benchmark tables; if it can, the policy debate gets a lot less comfortable.
