OpenRouter's model pricing suggests significant model quantization, as raw inference costs exceed API prices without high throughput or optimized serving. The author argues that unless providers achieve much better efficiency or offer premium, high-fidelity access, quantization likely degrades output quality—especially in complex tasks like planning and coding—raising concerns about transparency and access to true model capability.
OpenRouter model prices imply heavier quantization
from English