media r/LocalLLaMA · 7d ago · open_models

SIQ-1 Qwen3.6 Achieves Strong Performance in Autoresearch and Benchmarking


The SIQ-1 model, trained using PPO with verifiable rewards, outperforms GLM-5.2 and Qwen-350B on parameter-golf tasks, and its outputs resemble those of Opus4.8. It also beats NEX and GPT-5.5 on the bullshit-bench test. The model and a GGUF version are available on Hugging Face, along with a ZeroGPU-compatible agent demo.

Importance 2/3 · Beats a top-lab benchmark · r/LocalLLaMA · Alibaba (Qwen) · AI agents · Benchmark results · Reasoning models
