The audio.cpp framework has received a major expansion that adds music generation, SFX generation, and source separation, all implemented in C++ with GGML. The update integrates several new models: ACE-Step 1.5 Turbo, HeartMuLa, Stable Audio 3 Small and Medium, Mel-Band RoFormer, and HTDemucs.
- The release brings the framework's coverage to 21 out of 28 planned features (75%).
- HeartMuLa can now generate approximately 10 minutes of audio in a single run, lifting the earlier, shorter length limit.
- ACE-Step Turbo generates 600 seconds of music in 60.16 seconds of wall time, a real-time factor of 0.100 (roughly ten times faster than playback).
- A mem_saver mode is available for long-running use; it reduces resident VRAM after inference.
- HTDemucs remains slower than the Python path, and Stable Audio warm runs show mixed performance.
The author notes that the current release prioritizes establishing end-to-end paths within the shared framework before optimizing backend-specific performance.