media r/LocalLLaMA · 4d ago · open_models

GLM-5.2 Beats Gemini and GPT-5.4 in Coding but Is Inefficient


GLM-5.2 outperforms GPT-5.4 and the entire Gemini lineup at coding on the DeepSWE benchmark. However, it needs significantly more output tokens per task, which makes its cost per task substantially higher than that of models like GPT-5.5 and Claude Opus 4.8.

Importance 2/3 · Beats a top-lab benchmark · r/LocalLLaMA · Zhipu AI · Mistral AI · OpenAI · Code generation · Evaluation & benchmarks · Open weights

Benchmarks

Benchmark	Model	Score
SWE-bench Verified	GLM-5.2	0%
