A user reports that Gemma 4 26B, quantized to Q3, runs at 25 tokens per second on a MacBook Air and performs nearly as well as the bf16 version on non-coding, tool-calling tasks. They ask whether this impression reflects confirmation bias or whether small quantized models are genuinely usable.
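One way to separate confirmation bias from a real result is a blind paired comparison: run the same prompts through both the Q3 and bf16 models, hide which output came from which, record a preference per prompt, and check whether the split could plausibly be chance. The sketch below is an illustration only, not something the user describes doing. The labels `"q3"`, `"bf16"`, and `"tie"` and the helper names `tally` and `sign_test_p_value` are hypothetical, and the sample preferences are made-up placeholders rather than measured results.

```python
from math import comb


def tally(preferences):
    """Count blind preferences per model; 'tie' entries are ignored."""
    wins_q3 = sum(1 for p in preferences if p == "q3")
    wins_bf16 = sum(1 for p in preferences if p == "bf16")
    return wins_q3, wins_bf16


def sign_test_p_value(wins_a, wins_b):
    """Two-sided exact sign test on paired preferences (ties excluded).

    Returns the probability of a split at least this lopsided if both
    models were equally likely to be preferred on every prompt.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # One-sided tail of a Binomial(n, 0.5), doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Placeholder judgments for illustration only, not real data.
    judgments = ["q3", "bf16", "tie", "bf16", "q3", "bf16", "tie", "q3"]
    q3, bf16 = tally(judgments)
    print(f"q3 preferred: {q3}, bf16 preferred: {bf16}")
    print(f"sign test p-value: {sign_test_p_value(q3, bf16):.3f}")
```

A large p-value here only means the blind judgments do not distinguish the two models on this prompt set; with few prompts the test has little power, so a broader and more varied set of tasks gives a more trustworthy answer.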