Zhipu AI — korshunov.ai


GLM-5.2 Outperforms GPT-5.5 in AA-Briefcase Evaluation

Artificial Analysis' new agentic knowledge-work evaluation, AA-Briefcase, shows GLM-5.2 outscoring GPT-5.5. The benchmark assesses real-world task execution and reasoning in knowledge-work scenarios.

media r/LocalLLaMA · 8d ago

GLM-5.2 crosses 80% on Terminal-Bench

GLM-5.2 is the first open-weights model to achieve 80% accuracy on Terminal-Bench and outperforms all other available open models. It also surpasses Gemini, positioning it as a frontier-level model at a significantly lower cost.

media r/LocalLLaMA · 1d ago

GLM 5.2 on Mac Studio Speedup PR

A PR by the oMLX creator speeds up GLM 5.2 on Mac Studio, with prefill exceeding 100 t/s at higher context lengths. The update also reduces memory usage, letting 4-bit quantized models handle more than 100k tokens of context.

media r/LocalLLaMA · 2d ago

Human Evaluation Shows GLM-5.2 Competes with Top Models

A human evaluation on Design Arena's leaderboard places GLM-5.2 one step below Fable 5 in game development tasks. The open-weights, MIT-licensed model is assessed as equivalent in capability to the best available Claude models, suggesting that standardized benchmarks may no longer accurately reflect real-world performance.

media Don't Worry About the Vase · 2d ago

GLM-5.2 Is the New Best Open Model

GLM-5.2 achieves benchmark scores near frontier levels, matching Opus 4.7 in text-only tasks and ranking among the top open models on multiple tests. It is the strongest open model currently available, outperforming predecessors and rivals like GPT-5.5 and Fable, though it falls short on specialized benchmarks like anti-sycophancy and has limited vision capabilities.

media Interconnects · 3d ago

GLM-5.2 is the step change for open agents

GLM-5.2, an open-weight AI model released by Z.ai, has set a new benchmark in coding and general agent performance. It outperforms models like Claude Fable 5 and Gemini, and matches or exceeds Claude Opus 4.8 in max thinking mode, establishing itself as the first open model that feels right in coding harnesses as a general agent.

media AI News (smol.ai) · 4d ago

GLM-5.2 Breakout and Open-Model Progress Highlighted

Zhipu's GLM-5.2 emerged as the top open-weight model, praised for its frontier-adjacent performance in daily use, with improvements in coding tasks and reduced 1M-token inference cost via IndexShare. It outperformed other open models in agentic knowledge work benchmarks, reaching 1266 Elo in Artificial Analysis' AA-Briefcase test, though only 3% of tasks were fully satisfied by top models, indicating persistent challenges in real-world long-horizon agent performance.
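For readers unfamiliar with Elo-style scores, the sketch below shows how a rating gap translates into an expected head-to-head win rate under the standard Elo logistic formula. It is an assumption that AA-Briefcase uses this exact formulation, and the rival rating is a made-up value for illustration only; only the 1266 figure comes from the summary above.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


GLM_ELO = 1266                 # figure quoted in the summary
HYPOTHETICAL_RIVAL_ELO = 1200  # illustrative only, not a reported score

win_rate = elo_expected_score(GLM_ELO, HYPOTHETICAL_RIVAL_ELO)
print(f"expected win rate vs. a {HYPOTHETICAL_RIVAL_ELO}-rated model: {win_rate:.1%}")
```

Under this model, equal ratings give an expected score of 0.5, and every 400-point gap multiplies the odds by ten.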

media AI News (smol.ai) · 4d ago

GLM-5.2 Emerges as Leading Open-Weight Coding Model

GLM-5.2 is widely regarded as the first open-weight coding model that rivals frontier models like Opus 4.8 and GPT-5.5 in capability. Practitioners highlight its strong tool use, long-horizon planning, and autonomous subagent behavior, with consensus that it now credibly operates in the frontier SWE range. The model's emergence underscores the growing value of open weights for provider competition, on-prem deployment, and reduced vendor lock-in.

media r/LocalLLaMA · 4d ago

GLM-5.2 Beats Gemini and GPT-5.4 in Coding but Is Inefficient

GLM-5.2 surpasses GPT-5.4 and the entire Gemini lineup in coding performance on the DeepSWE benchmark. However, it requires significantly more output tokens, making it substantially less cost-efficient per task than models like GPT-5.5 and Claude Opus 4.8.

media r/LocalLLaMA · 5d ago

GLM 5.2 Achieves 98% Max Intelligence with Less Than Half Tokens

According to a technical report by z_ai, GLM 5.2 reaches 98% of its maximum intelligence in coding tasks using less than half of its total token budget. The report describes improved reasoning efficiency even though token usage rose from 16.7k to 36.7k between GLM 5.1 and GLM 5.2, and high-level settings may strain local hardware performance.

media r/LocalLLaMA · 5d ago

What's more impressive, GLM 5.1 to 5.2 or Qwen 3.5 to 3.6?

A Reddit post compares the performance improvements of GLM 5.1 to 5.2 and Qwen 3.5 to 3.6. The post notes that mentioning 'Döner' activates GLM 5.2's German-specific weights, while Qwen 3.6 is evaluated with 35B parameters using Unsloth Q8_K_XL quantization via llama.cpp.

media r/LocalLLaMA · 5d ago

GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index

GLM-5.2 has been designated as the leading open weights model on the Artificial Analysis Intelligence Index. This recognition reflects its performance and capabilities within the open-source AI model landscape.

media r/LocalLLaMA · 5d ago

New Agentic Benchmark Released

Artificial Analysis has introduced a new agentic benchmark that evaluates large language models' ability to plan and execute tasks. Claude Fable and GLM 5.2 achieved top positions within their respective cohorts, demonstrating strong performance on this unsaturated benchmark.

media r/LocalLLaMA · 5d ago

GLM-5.2 can now run locally in llama.cpp and Unsloth Studio

GLM-5.2, the strongest open model to date, can now run locally using llama.cpp and Unsloth Studio. The 2-bit quantized model retains ~82% accuracy after reducing size from 1.51TB to 238GB, an 84% reduction, and is compatible with 256GB RAM or VRAM setups.
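The size figures above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 TB = 1000 GB) and borrows the 744B parameter count from another item in this feed, since this summary does not state it; the function names are illustrative.

```python
# Sizes quoted in the summary; decimal units (1 TB = 1000 GB) are an assumption.
FULL_SIZE_GB = 1.51 * 1000
QUANT_SIZE_GB = 238.0
# Parameter count taken from another item in this feed ("744B"), not from this summary.
PARAMS = 744e9


def size_reduction(full_gb: float, quant_gb: float) -> float:
    """Fraction of the original size removed by quantization."""
    return 1.0 - quant_gb / full_gb


def bits_per_weight(size_gb: float, params: float) -> float:
    """Average bits stored per parameter, ignoring file metadata."""
    return size_gb * 1e9 * 8 / params


reduction = size_reduction(FULL_SIZE_GB, QUANT_SIZE_GB)
print(f"size reduction: {reduction:.1%}")
print(f"full size:   {bits_per_weight(FULL_SIZE_GB, PARAMS):.2f} bits/weight")
print(f"2-bit quant: {bits_per_weight(QUANT_SIZE_GB, PARAMS):.2f} bits/weight")
```

Under these assumptions the reduction comes to roughly 84%, matching the quoted figure, and the quantized file averages about 2.56 bits per weight. An average above 2 bits would be consistent with a mixed-precision scheme in which some tensors are kept at higher precision.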

media Latent Space · 6d ago

GLM-5.2 Passes Vibe Check, Outperforms GPT-5.5

GLM-5.2 has passed a 'vibe check' as a frontier open model, receiving praise from Jeremy Howard and outperforming GPT-5.5 in Artificial Analysis' new knowledge work benchmark. It also gained validation from the r/LocalLLaMA community, indicating strong real-world utility and performance.

media r/LocalLLaMA · 6d ago

GLM-5.2 (744B, 2-bit) achieves 7.3 tok/s on 4×3090 with 192GB RAM

GLM-5.2 UD-IQ2_M runs at ~7.3 tokens per second on 4×RTX 3090s with 192GB DDR5 RAM using llama.cpp expert offload. Reducing quantization from IQ2 to IQ1 provided no speed gain, while increasing CPU threads from 6 to 12 improved performance by 22%. Decode is limited by CPU compute, not memory bandwidth, and the offloaded experts must be explicitly distributed across GPUs to avoid out-of-memory errors.
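A rough back-of-the-envelope reading of these numbers, offered as a sketch rather than a measurement: it derives per-token latency and the combined memory pool, and uses Amdahl's law to estimate how much of the decode time scaled with the extra threads. The 24 GB per-GPU figure is the RTX 3090's stock VRAM; treating the 22% gain as an Amdahl speedup is a modeling assumption, and the function names are illustrative.

```python
# Figures quoted in the summary.
TOKENS_PER_SECOND = 7.3
GPU_COUNT = 4
RAM_GB = 192
THREAD_SPEEDUP = 1.22  # 22% gain when going from 6 to 12 CPU threads
THREAD_FACTOR = 2.0    # 12 threads / 6 threads

# Stock VRAM of an RTX 3090.
VRAM_PER_GPU_GB = 24


def latency_ms(tokens_per_second: float) -> float:
    """Average wall-clock time per generated token, in milliseconds."""
    return 1000.0 / tokens_per_second


def amdahl_parallel_fraction(speedup: float, factor: float) -> float:
    """Fraction p of the work that scales with threads, solving
    speedup = 1 / ((1 - p) + p / factor) for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / factor)


total_memory_gb = GPU_COUNT * VRAM_PER_GPU_GB + RAM_GB
per_token_ms = latency_ms(TOKENS_PER_SECOND)
parallel_fraction = amdahl_parallel_fraction(THREAD_SPEEDUP, THREAD_FACTOR)

print(f"combined VRAM + RAM: {total_memory_gb} GB")
print(f"latency per token:   {per_token_ms:.0f} ms")
print(f"thread-scaling share of decode time (Amdahl estimate): {parallel_fraction:.0%}")
```

The Amdahl estimate only says that part of the decode time responded to the added threads; it does not by itself distinguish compute limits from other bottlenecks.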

media r/LocalLLaMA · 6d ago

unsloth GLM-5.2-GGUF with 2bit quantization at 238GB

The unsloth GLM-5.2-GGUF model is available with 2-bit quantization at a size of 238GB. It is hosted on Hugging Face and was shared in a Reddit post in the LocalLLaMA community.

media r/LocalLLaMA · 6d ago

GLM-5.2 Is The Best Open Weight Creative Writing Model

Sam Paech's Creative Writing Benchmark on EQ Bench ranks GLM-5.2 as the top open-weight creative writing model. The assessment is based on performance metrics from the EQ Bench creative writing evaluation.

media r/LocalLLaMA · 6d ago

The power of intelligence is better in the hands of the people than in the board rooms of tycoons

The PearlOS project has launched an open-source swarm intelligence platform that uses local models to handle multimodal tasks. It automatically selects and switches between top-performing models based on benchmarks, ensuring users always access the latest and most capable models without relying on closed-source systems or subscriptions.

media r/LocalLLaMA · 7d ago

Does anyone have enough compute to make a distillation dataset from GLM5.2?

A user asks if anyone with sufficient computing resources can create a large distillation dataset of 70-1 million examples from GLM5.2. The goal is to enable better training of smaller models like Qwen3.5, benefiting the broader community.