Zhipu AI — korshunov.ai

Lab · Zhipu AI

GLM-5.2 is the first open-weights model to achieve 80% accuracy on Terminal-Bench and outperforms all other available open models. It also surpasses Gemini, positioning it as a frontier-level model at a significantly lower cost.

media r/LocalLLaMA · 7d ago

Does anyone have enough compute to make a distillation dataset from GLM5.2?

A user asks if anyone with sufficient computing resources can create a large distillation dataset of 70-1 million examples from GLM5.2. The goal is to enable better training of smaller models like Qwen3.5, benefiting the broader community.

media r/LocalLLaMA · 7d ago

GLM-5.2 Review and Censorship Response

GLM-5.2 demonstrates exceptional long-context coherence and conversational fluency, outperforming Gemini-3.1-Pro on text-only tasks and matching GPT-5.5 in reasoning quality. The model responds factually to sensitive topics like Taiwan and Tiananmen Square, providing detailed historical context without overt censorship, though it adheres to Chinese government content guidelines.

blog Simon Willison · 7d ago

GLM-5.2 is the leading open weights model on the Artificial Analysis Intelligence Index

GLM-5.2, a 753B-parameter text-only model from Z.ai, is now the top open weights model on the Artificial Analysis Intelligence Index, outperforming MiniMax-M3, DeepSeek V4 Pro, and Kimi K2.6. It features a 1 million token context window and ranks second on the Code Arena WebDev leaderboard, despite lacking image input capabilities.

media r/LocalLLaMA · 7d ago

GLM-5.2-FP8 HGX-H200 SGLang Docker Deployment Config

A user shares a Docker configuration for running GLM-5.2-FP8 on HGX-H200 hardware using SGLang. The setup reaches a 262k context length and 70 tokens per second with 8-way tensor parallelism and a memory fraction of 0.83. The user notes that the official vLLM recipes do not work on H200 because of FP8 KV-cache quantization limitations on the DSV3 architecture.
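
As a rough illustration of how the reported settings map onto SGLang's server launcher, the sketch below assembles a launch command in Python. It is not the poster's actual Docker configuration: the model path `zai-org/GLM-5.2-FP8` is a placeholder, and reading "262k" as 262,144 tokens is an assumption. Only the `--model-path`, `--tp`, `--mem-fraction-static`, and `--context-length` options of `sglang.launch_server` are used.

```python
import shlex


def build_sglang_launch_args(
    model_path: str,
    tp_size: int = 8,                # 8-way tensor parallelism, as reported
    mem_fraction: float = 0.83,      # static memory fraction, as reported
    context_length: int = 262_144,   # assumption: "262k" read as 2**18 tokens
) -> list[str]:
    """Return an argv list for launching an SGLang server with these settings."""
    return [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tp", str(tp_size),
        "--mem-fraction-static", str(mem_fraction),
        "--context-length", str(context_length),
    ]


if __name__ == "__main__":
    # Placeholder model path, for illustration only.
    args = build_sglang_launch_args("zai-org/GLM-5.2-FP8")
    print(shlex.join(args))
```

Inside a container, a command like this would typically serve as the entrypoint; the Docker-specific parts (image, GPU flags, volume mounts) are left out because the summary does not describe them.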

media r/LocalLLaMA · 8d ago

GLM-5.2 is a win for local AI

GLM-5.2, with 753B parameters and a 1M-token context window, is now accessible on local hardware through quantization. Its MIT license and extensive training data enable community fine-tuning of smaller models, promising significant improvements for local AI setups.

media r/LocalLLaMA · 8d ago

GLM-5.2: Built for Long-Horizon Tasks

GLM-5.2 is a language model designed specifically for long-horizon tasks. It aims to better handle complex, multi-step reasoning and long-term planning by improving its ability to maintain context over extended sequences.

media Latent Space · 8d ago

GLM-5.2 Claims Top Position in Frontend Coding with Speculative Decoding

GLM-5.2, a 744B parameter model from Z.ai, has been evaluated as the top frontend coding model globally, outperforming all Opus versions including Opus 4.8. This achievement is highlighted in third-party evaluations that validate official offline tests, marking a significant milestone for a model of its size, particularly in the competitive frontend coding domain.

arxiv arXiv cs.CL · 8d ago

ChLogic: Testing Logical Reasoning Robustness in Chinese Expressions

ChLogic evaluates how well large language models maintain logical reasoning when English logical structures are expressed in Chinese. It reveals a persistent English-Chinese performance gap, with back-translation improving results on general items but harming performance on difficult problems. The benchmark highlights the impact of surface realization, translation artifacts, and model-specific behaviors on multilingual reasoning.

media r/LocalLLaMA · 8d ago

GLM-5.2 Now First on Design Arena

GLM-5.2 is now ranked first on Design Arena, ahead of Claude Fable 5. Claude Fable 5 is no longer available, which contributed to GLM-5.2 taking the top position.

media r/LocalLLaMA · 8d ago

Zhipu surges 33% as Wall Street raises bets on China AI after Anthropic curbs

Zhipu's stock rises 33% following Wall Street's increased interest in China's AI sector. The surge comes after Anthropic, a U.S. AI firm, curtails its operations, prompting market speculation about the competitive dynamics in global AI development.

media r/LocalLLaMA · 8d ago

GLM-5.2 Releases Open Weights with Strong Coding Performance

GLM-5.2 has launched with open weights, a 1M context window, MIT license, and two reasoning modes. Early results show it ranks near the top in coding benchmarks, indicating strong real-world potential beyond API-only models.

media r/LocalLLaMA · 8d ago

GLM 5.2 API Live, Weights on Hugging Face, Ollama Support

GLM 5.2's API is now live, with model weights available on Hugging Face under the MIT license and supported by Ollama. The model offers two thinking modes, High and Max, with a 1M-token context length. It is priced at $1.40 per 1M input tokens and $4.40 per 1M output tokens, the same pricing as GLM-5.1.
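
To make the per-token pricing concrete, here is a minimal cost estimator based only on the two rates quoted above. The function name and the example token counts are illustrative, and the sketch ignores any caching discounts or provider-specific surcharges, which the summary does not mention.

```python
INPUT_USD_PER_MILLION = 1.4   # quoted price per 1M input tokens
OUTPUT_USD_PER_MILLION = 4.4  # quoted price per 1M output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for one request at the quoted rates."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical long-context request: 200k tokens in, 20k tokens out.
    print(f"${estimate_cost_usd(200_000, 20_000):.3f}")
```

At these rates a request that fills the full 1M-token context costs about $1.40 in input tokens alone, so long-context workloads are dominated by prompt size rather than output length.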

media r/LocalLLaMA · 9d ago

GLM-5.2 Takes #2 Spot on WebDev Arena

GLM-5.2 has taken second place on the WebDev Arena leaderboard. The result reflects strong performance on web development tasks relative to other models.

media r/LocalLLaMA · 7d ago

GLM 5.2 Release Video Made with GLM 5.2

A video showcasing GLM 5.2's capabilities was created and shared online. Users note it performs well in web development tasks, though still below top models like Gemini 3.1 Pro in video generation. Long outputs frequently time out on OpenRouter, so users have to switch providers to receive full responses.

media r/LocalLLaMA · 8d ago

GLM 5.2 on 4x Sparks: Reasonable?

A user asks whether running GLM-5.2 on four ASUS Ascent GX10 units (DGX Spark-class machines) is feasible. They ask about 4-bit quantization within the combined 512GB of unified memory and about expected prompt-processing and generation speeds at a 100k context length, noting that no performance data is available online yet.
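
A back-of-the-envelope check of the memory question can be sketched as follows. The 753B parameter count is taken from other posts in this feed (one post gives 744B), so treat it as an assumption. The calculation covers weights only: real 4-bit formats store extra scale data, and the KV cache for a 100k context needs additional memory on top.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10**9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


ASSUMED_PARAMS_BILLION = 753   # assumption: figure quoted elsewhere in this feed
TOTAL_UNIFIED_MEMORY_GB = 512  # four machines, as stated in the post

weights_gb = weight_memory_gb(ASSUMED_PARAMS_BILLION, 4.0)
headroom_gb = TOTAL_UNIFIED_MEMORY_GB - weights_gb

if __name__ == "__main__":
    print(f"weights at 4 bits/weight: {weights_gb} GB")
    print(f"headroom before KV cache and overhead: {headroom_gb} GB")
```

Under these assumptions the weights fit with room to spare, but the estimate says nothing about speed: throughput across four networked machines depends on interconnect and memory bandwidth, which the post notes has not been measured.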

media r/LocalLLaMA · 8d ago

GLM-5.2 Max is currently the third best model

GLM-5.2 Max is ranked as the third best model available, across both open and proprietary models. The assessment is based on performance benchmarks and current evaluations in the field of large language models.

media r/LocalLLaMA · 8d ago

Cheapest way to run GLM 5.x locally without unified memory

A user explores cost-effective methods to run GLM 5.x locally using 4-bit quantization, such as IQ4_XS, without relying on unified memory. Options include CPU-only setups like Sapphire Rapids ES with DDR5, multi-GPU offloading, or similar-sized models. The user runs a 5900X + 128GB DDR4 + 7900XT 20GB system, successfully handling Minimax 2.7 at Q4_K_S and Qwen 3.6 27B at IQ4_XS.

media r/LocalLLaMA · 9d ago

GLM-5.2 Now Available on HuggingChat

The GLM-5.2 model is now accessible on HuggingChat. Users can access it via the HuggingFace link provided, enabling direct interaction with the model through the platform.

media r/LocalLLaMA · 9d ago

zai-org Releases GLM-5.2

zai-org has released GLM-5.2, a new large language model. The model is available on Hugging Face and is being discussed in the r/LocalLLaMA community.
