The author has released web and Python implementations of enhanced voice controls for Kokoro, both designed to be easy to port into other projects. Both run entirely on the user's machine, and the web version reaches approximately 40 ms per generation when WebGPU hardware acceleration is enabled.
- The project includes both a web interface (kokoro-lab-web) and a Python library (kokoro-lab-py).
- The GitHub page loads the 300 MB Kokoro FP32 model directly from Hugging Face.
- The enhancements focus on improved voice controls to address limitations seen in existing Kokoro projects.
These minimal versions are provided for developers to integrate improved control mechanisms into their own applications.
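
A per-generation latency figure like the roughly 40 ms quoted for the web version can be sanity-checked in the Python version with a small timing harness. The sketch below is a generic benchmark, not part of kokoro-lab-py: `time_generation` and `placeholder_generate` are hypothetical names, and the placeholder stands in for whatever synthesis call the library actually exposes.

```python
import statistics
import time
from typing import Callable


def time_generation(generate: Callable[[str], object], text: str,
                    warmup: int = 2, runs: int = 10) -> dict:
    """Return latency statistics (in milliseconds) for repeated generate(text) calls."""
    for _ in range(warmup):
        generate(text)  # warm-up calls are excluded from the statistics

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(text)
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": runs,
        "median_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }


def placeholder_generate(text: str) -> bytes:
    """Hypothetical stand-in for a real synthesis call; returns dummy bytes."""
    return text.encode("utf-8")


if __name__ == "__main__":
    stats = time_generation(placeholder_generate, "Hello from Kokoro.")
    print(f"median {stats['median_ms']:.2f} ms over {stats['runs']} runs")
```

To measure a real model, pass the library's own synthesis function in place of `placeholder_generate`. Reporting the median after a few warm-up calls keeps one-off costs such as model loading out of the number.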