The author introduces USAF, a new sparse fine-tuning method for Mixture of Experts (MoE) models. It is designed to make fine-tuning possible on hardware that can otherwise only run inference.
- The method trains sparse expert weights and the router instead of using adapters.
- It allows fine-tuning of Qwen3-30B-A3B on an AMD RX 6750 XT with 12 GB of VRAM.
- The project is open source under the Apache 2.0 license.
This approach aims to democratize access to MoE model customization by lowering the high hardware requirements typically associated with fine-tuning.
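
To make the idea of training sparse expert weights and the router instead of adapters more concrete, here is a minimal pure-Python sketch. It is not USAF's implementation. The parameter names (`layers.0.router.w` and so on), the `density` parameter, the helper functions, and especially the random choice of which expert weights to train are all assumptions made for illustration; the summary does not say how USAF selects the trainable weights.

```python
import random

def build_masks(params, density, seed=0):
    """Pick the trainable entries of each parameter (hypothetical scheme).

    Router weights: all entries trainable.
    Expert weights: a random subset of about `density` of the entries
                    (random selection is an assumption, not USAF's rule).
    Everything else (attention, embeddings, ...): frozen.
    """
    rng = random.Random(seed)
    masks = {}
    for name, values in params.items():
        n = len(values)
        if ".router." in name:
            masks[name] = set(range(n))
        elif ".experts." in name:
            k = max(1, round(n * density))
            masks[name] = set(rng.sample(range(n), k))
        else:
            masks[name] = set()
    return masks

def masked_sgd_step(params, grads, masks, lr=0.1):
    """Plain SGD applied only to the entries selected by the masks."""
    for name, indices in masks.items():
        for i in indices:
            params[name][i] -= lr * grads[name][i]

def trainable_count(masks):
    """Number of entries that would need gradients and optimizer state."""
    return sum(len(indices) for indices in masks.values())

# Toy model: flat lists stand in for weight tensors.
params = {
    "layers.0.attn.w": [0.0] * 100,
    "layers.0.router.w": [0.0] * 8,
    "layers.0.experts.0.w": [0.0] * 1000,
    "layers.0.experts.1.w": [0.0] * 1000,
}
masks = build_masks(params, density=0.01)
total = sum(len(v) for v in params.values())
print(f"trainable entries: {trainable_count(masks)} of {total}")
```

In this toy setup only the router and a small fraction of each expert receive updates, so gradient and optimizer state would be needed for a few entries rather than for the whole model. That is the general reason a sparse scheme of this kind can fit in less memory than full fine-tuning; how USAF achieves this in practice is documented in the project itself.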