Consider post-training instead of benchmarking for new hardware

The author argues that newly acquired hardware is better spent on supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) than on standard model benchmarking. Post-training open models offers a viable path to monetization, especially as proprietary APIs become less accessible or more expensive.

Post-training requires balancing quality against speed, and data mixing and data synthesis are critical to the resulting performance.
Model characteristics significantly affect training: the author finds Qwen models difficult to fine-tune because their knowledge is already saturated, while Llama models absorb new information more easily.
Reinforcement fine-tuning involves a complex mix of inference rollouts and weight updates using methods like PPO or GRPO.
Engineering skills are essential for building low-power, massively parallel stacks that allow for rapid iteration cycles.
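
To make the reinforcement fine-tuning point more concrete, here is a minimal sketch of the group-relative advantage step used in GRPO: several rollouts are sampled for the same prompt, each is scored, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` value, the choice of population standard deviation, and the example rewards are illustrative assumptions, not details from the author's setup. The policy update itself is omitted.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts for the same prompt.

    Hypothetical helper: GRPO-style advantage = (reward - group mean) / group std.
    eps (an assumed value) guards against division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]

# Made-up rewards for four rollouts of a single prompt (1.0 = correct, 0.0 = wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group with identical rewards carries no learning signal: all advantages are zero.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Rollouts that beat their group average get a positive advantage and the rest get a negative one, so no separate value network is needed, unlike in PPO. This is also why the rollout (inference) side dominates the workload: every prompt needs several sampled completions before a single weight update can be computed.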

Custom post-training is presented as one of the few remaining opportunities in the open model space, offering potential income despite being competitive and hardware-dependent.