media r/LocalLLaMA · 10d ago

Can I realistically get close to Claude/Codex capabilities locally?

A user with a 32GB system asks whether open-weight models running on local hardware can match Opus 4.8's 1M context and coding performance. They cite context length and privacy as their main concerns, and ask whether high-end models like GLM 5.2 or Qwen3.7 are feasible within a $3.5K budget. They also argue that 70-80B models offer only marginal real-world gains over 27B models with 256K context.
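As a rough sanity check on the feasibility question, the weights-only footprint of a model can be estimated from its parameter count and quantization width. The sketch below is not from the post: the 4-bit width is an assumption, `weights_gib` is a hypothetical helper, and the estimate ignores KV cache, activations, and runtime overhead, all of which add to the total.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed 4-bit quantization; real quantized files carry extra metadata.
for size in (27, 70, 80):
    print(f"{size}B at 4-bit: about {weights_gib(size, 4):.1f} GiB of weights")
```

Under these assumptions a 27B model leaves room for context on a 32GB system, while a 70B model's weights alone already exceed it.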

media r/LocalLLaMA · 10d ago

Running MiMo-2.5 on Two Halo Strixeses

A user reports running MiMo-2.5 across two 128GB Strix Halo machines (AMD Radeon 8060S graphics), using Proxmox containers and USB4Net for connectivity. The setup reaches 356 tokens/s prompt processing (pp) and 15 tokens/s token generation (tg) at 10k context, which the user gives as 1% of the context length, and they ask whether this counts as merely viable or as elite-tier performance. They also report difficulty building vLLM and SGLang for consumer hardware, saying vLLM is unreliable and SGLang is designed for datacenters rather than personal systems.
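To put the reported 356 pp and 15 tg figures in wall-clock terms, here is a minimal sketch that assumes both are tokens per second and that the rates stay constant over the request. The 500-token reply length and the `latency_seconds` helper are illustrative choices, not details from the post.

```python
def latency_seconds(prompt_tokens: int, output_tokens: int,
                    pp_rate: float, tg_rate: float) -> tuple[float, float]:
    """Return (prefill_seconds, decode_seconds) for one request."""
    prefill = prompt_tokens / pp_rate   # time to process the prompt
    decode = output_tokens / tg_rate    # time to generate the reply
    return prefill, decode


# 10k-token prompt at the reported rates, with an assumed 500-token reply.
prefill, decode = latency_seconds(10_000, 500, pp_rate=356.0, tg_rate=15.0)
print(f"prefill: {prefill:.1f} s, decode: {decode:.1f} s")
# prints "prefill: 28.1 s, decode: 33.3 s"
```

In practice generation speed usually drops as context grows, so these numbers are best read as an optimistic estimate.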