The author introduces USAF, a sparse fine-tuning method for Mixture of Experts (MoE) models, designed to make fine-tuning possible on hardware that could otherwise only run inference.

  • The method trains sparse expert weights and the router instead of using adapters.
  • It allows fine-tuning of Qwen3-30B-A3B on an AMD RX 6750 XT with 12 GB of VRAM.
  • The project is open source under the Apache 2.0 license.
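
To make the first bullet concrete, here is a minimal sketch of what "training sparse expert weights and the router" could look like as an update rule. It is not USAF's actual implementation: the random mask, the `density` parameter, and the function names `make_sparse_mask` and `sparse_sgd_step` are all assumptions made for illustration, and the summary does not say how USAF chooses which weights to train.

```python
import random


def make_sparse_mask(n_weights, density, rng):
    """Pick the indices that will be trained.

    Hypothetical selection rule: a uniform random subset. The real method
    may choose trainable weights in a completely different way.
    """
    k = max(1, int(n_weights * density))
    return set(rng.sample(range(n_weights), k))


def sparse_sgd_step(weights, grads, mask, lr):
    """Plain SGD applied only at masked indices; other weights stay frozen."""
    return [
        w - lr * g if i in mask else w
        for i, (w, g) in enumerate(zip(weights, grads))
    ]


rng = random.Random(0)

# Toy stand-ins for one expert's weights and the router's weights.
expert = [0.5] * 10
router = [0.1] * 4
expert_grads = [1.0] * 10
router_grads = [1.0] * 4

# Train only 20% of the expert weights (assumed density).
mask = make_sparse_mask(len(expert), density=0.2, rng=rng)
new_expert = sparse_sgd_step(expert, expert_grads, mask, lr=0.1)

# The router is small, so this sketch updates all of it.
full_router_mask = set(range(len(router)))
new_router = sparse_sgd_step(router, router_grads, full_router_mask, lr=0.1)

changed = sum(1 for old, new in zip(expert, new_expert) if old != new)
print(f"expert weights updated: {changed} of {len(expert)}")
```

In a scheme like this, gradients and optimizer state are only needed for the masked entries, which is one plausible reason such an approach can fit in far less memory than full fine-tuning. Unlike an adapter, no extra parameters are added: the original weights themselves are modified.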

This approach aims to make MoE model customization accessible on consumer hardware by lowering the hardware requirements that fine-tuning typically carries.