The audio.cpp framework has released a major expansion adding music generation, SFX generation, and source separation capabilities using C++ and GGML. This update integrates several new models, including ACE-Step 1.5 Turbo, HeartMuLa, Stable Audio 3 Small and Medium, Mel-Band RoFormer, and HTDemucs.

  • The release brings the framework's coverage to 21 out of 28 planned features (75%).
  • HeartMuLa can now generate approximately 10 minutes of audio in a single run, removing the earlier short duration limit.
  • ACE-Step 1.5 Turbo generates 600 seconds of music in 60.16 seconds of wall time, a real-time factor of 0.100 (roughly 10x faster than real time).
  • A mem_saver mode is available for long-lived usage to reduce resident VRAM after inference.
  • HTDemucs remains slower than the Python path, and Stable Audio warm runs show mixed performance.
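
The figures in the list above can be reproduced with a few lines of arithmetic. The sketch below assumes the real-time factor is wall-clock time divided by the duration of the generated audio, which is the definition the reported numbers imply (60.16 s / 600 s ≈ 0.100). The function names are illustrative and are not part of audio.cpp.

```python
def real_time_factor(wall_seconds: float, audio_seconds: float) -> float:
    """Wall-clock time divided by generated audio duration.

    Values below 1.0 mean generation runs faster than real time.
    Assumed definition, inferred from the reported figures.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return wall_seconds / audio_seconds


def coverage_percent(done: int, planned: int) -> float:
    """Share of planned features already implemented, in percent."""
    if planned <= 0:
        raise ValueError("planned must be positive")
    return 100.0 * done / planned


if __name__ == "__main__":
    # Numbers taken from the release summary above.
    rtf = real_time_factor(wall_seconds=60.16, audio_seconds=600.0)
    print(f"RTF: {rtf:.3f}")                                   # RTF: 0.100
    print(f"Speed-up over real time: {1.0 / rtf:.2f}x")        # 9.97x
    print(f"Coverage: {coverage_percent(21, 28):.0f}%")        # Coverage: 75%
```

Under the same definition, a model with a real-time factor above 1.0 would take longer to run than the audio it produces.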

The author notes that the current release prioritizes establishing end-to-end paths within the shared framework before optimizing backend-specific performance.