GLM-5.2 has been evaluated on the DeepSWE benchmark, and its result is highlighted in the top-right corner of the visualization. The post notes that scores decrease as price increases and points to the DeepSWE website and ArtificialAnalysis for alternative evaluations. It also addresses criticisms of, and historical context around, benchmark validity.
GLM-5.2 Released on DeepSWE Benchmark