A Reddit user demonstrates how to assemble a local AI inference rig for under $2500 from affordable second-hand components, with the goal of running large language models like GLM-5.2 without expensive enterprise hardware.
The proposed build costs approximately $1920 for core components: an EPYC motherboard and CPU ($460), two used NVIDIA Tesla P40 24GB GPUs ($460 total), and 512GB of DDR4 RAM ($1000). An additional $350-$580 covers necessary peripherals such as the power supply, storage, and cooling, bringing the total to between roughly $2270 and $2500. This configuration supports GLM-5.2 Q2/Q3/Q4 quantized variants via cmoe and llama.cpp, as well as models like Kimi-K2.6 and DeepSeek.
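The budget figures above can be checked with a few lines of arithmetic. The following is a minimal sketch that only restates the prices and capacities quoted in the post; the variable names are illustrative, and the combined memory figure is simply GPU VRAM plus system RAM, not a statement about how much of it any particular model can use.

```python
# Prices in USD, taken from the build description above.
core_components = {
    "EPYC motherboard + CPU": 460,
    "2x Tesla P40 24GB (used)": 460,
    "512GB DDR4 RAM": 1000,
}

# Range quoted for power supply, storage, and cooling.
peripherals_low, peripherals_high = 350, 580

core_total = sum(core_components.values())
total_low = core_total + peripherals_low
total_high = core_total + peripherals_high

# Memory capacities in GB: two 24 GB GPUs plus system RAM.
vram_gb = 2 * 24
ram_gb = 512
combined_memory_gb = vram_gb + ram_gb

print(f"Core components: ${core_total}")
print(f"Full build: ${total_low} to ${total_high}")
print(f"Memory: {vram_gb} GB VRAM + {ram_gb} GB RAM = {combined_memory_gb} GB")
```

The upper end of the peripheral range is what brings the build to the $2500 ceiling; with cheaper peripherals the total lands closer to $2270.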
Inference will be slow, which makes real-time agent use impractical, but the setup lets users handle planning tasks and serious debugging locally without relying on commercial API providers.