Considering an upgrade from 2 x RTX 3090 to 4 x RTX 5070 Ti

A user on r/LocalLLaMA is considering replacing two RTX 3090 GPUs with four RTX 5070 Ti cards and wants to know how the change would affect single-stream inference performance.

The proposed configuration uses an Asus ProArt Creator B850 Neo motherboard with a PCIe 5.0 x4/x4/x4/x4 lane distribution.
Populating both primary x16 slots splits the CPU's 16 lanes into PCIe 5.0 x8/x8 mode, while the two M.2 slots each keep a dedicated full-speed connection.
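
To put the link widths above in perspective, here is a minimal sketch of the theoretical per-direction bandwidth of a PCIe link. It uses the per-lane signaling rates and the 128b/130b encoding defined for PCIe 3.0 and later; the function name `pcie_bandwidth_gb_s` is made up for this illustration, and real-world throughput will be lower because of protocol overhead.

```python
# Per-lane raw signaling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    gbit_per_s = GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes
    return gbit_per_s / 8  # convert gigabits to gigabytes


if __name__ == "__main__":
    # Link shapes relevant to the setup: 5.0 x4, 5.0 x8, and 4.0 x16 for comparison.
    for gen, lanes in [(5, 4), (5, 8), (4, 16)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

By this estimate a PCIe 5.0 x4 link offers roughly 15.75 GB/s per direction, and a PCIe 5.0 x8 link matches a PCIe 4.0 x16 link on paper. Whether that is enough depends on how the inference engine splits the model across cards, which this sketch does not model.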
The user seeks community feedback on expected performance for Qwen 3.6 27B with 4-bit base weights and an 8-bit KV cache.
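
As a rough guide to whether such a model fits, the sketch below estimates the memory taken by the quantized weights and by the KV cache. The weight figure follows directly from the parameter count and bit width. The KV cache figure assumes standard attention in every layer, and the layer count, KV head count, head dimension, and context length are placeholder values chosen for illustration, not the model's published configuration. Real 4-bit formats also store scales and other metadata, so actual files are somewhat larger.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Bytes needed for the weights alone, ignoring quantization metadata."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bits_per_value: float) -> float:
    """Bytes for the KV cache of one sequence, assuming plain attention in every layer."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bits_per_value / 8


if __name__ == "__main__":
    weights = weight_bytes(27e9, 4)  # 27B parameters at 4 bits each

    # HYPOTHETICAL architecture values, for illustration only.
    kv = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128,
                        context_tokens=32768, bits_per_value=8)

    print(f"weights:  {weights / 1e9:.1f} GB")
    print(f"KV cache: {kv / 1e9:.1f} GB (placeholder architecture)")
    print(f"total:    {(weights + kv) / 1e9:.1f} GB")
```

Substituting the real values from the model's configuration file gives a more meaningful total, which can then be compared against the combined VRAM of each setup (2 x 24 GB for the RTX 3090s versus 4 x 16 GB for the RTX 5070 Ti cards, per their published specifications).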

The post expresses skepticism toward Google's conservative predictions that limited PCIe lanes will bottleneck inference speed, noting a previous instance where the actual speedup significantly exceeded online estimates.