Reddit user considers switching from Qwen3.6 35B to Qwen3.5 122B for better general knowledge

A Reddit user is seeking advice on upgrading their local large language model setup, specifically weighing the trade-off between inference speed and general knowledge capabilities.

The user currently runs Qwen3.6 35B as their primary assistant and coding agent on a Strix Halo device.
They report achieving approximately 30-40 tokens per second with a 131k context window.
The user feels the current model lacks basic general knowledge and functions more like a task executor than an assistant.
To address this, they are considering switching to the larger Qwen3.5 122B model while trying to maintain acceptable speed.
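
The speed side of this trade-off can be roughed out with a back-of-envelope estimate. Token generation on unified-memory hardware is often limited by memory bandwidth, so an upper bound on decode speed is roughly the bandwidth divided by the bytes of weights read per token. The sketch below assumes that simple model. The function name, the parameter names, and every number in the example are hypothetical placeholders, not figures from the post or measured values for either model.

```python
def decode_tokens_per_second(active_params_billion: float,
                             bits_per_weight: float,
                             bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Assumes each generated token reads every *active* weight once and
    ignores KV-cache traffic, compute limits, and runtime overhead, so
    real throughput will be lower.
    """
    # Bytes of weights streamed from memory for one token.
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    # Bandwidth (bytes/s) divided by bytes/token gives tokens/s.
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical inputs only: substitute the active parameter count,
# quantization width, and memory bandwidth of the actual setup.
small = decode_tokens_per_second(active_params_billion=3,
                                 bits_per_weight=4,
                                 bandwidth_gb_s=250)
large = decode_tokens_per_second(active_params_billion=10,
                                 bits_per_weight=4,
                                 bandwidth_gb_s=250)
print(f"estimated slowdown factor: {small / large:.2f}x")
```

Under this model the slowdown depends only on the ratio of weight bytes read per token, which is why the active parameter count and quantization matter more for speed than the total model size. Measuring with the actual runtime remains the only reliable check.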