Benchmark · coding
HumanEval+
saturated
2 results
1 model
DeepSeek-Coder-1.3B
Timeline
-
2026-06-16
DeepSeek-Coder-1.3B
12.0 tasks
Post-Hoc Operators Fail to Improve Accuracy in Small Code Models
-
2026-06-16
DeepSeek-Coder-1.3B
12.0 tasks
Post-Hoc Falsification Operators Fail to Improve Accuracy in Small Code Models