Andi from Hugging Face has released a fully open-source and free-to-use demo that creates a voice interaction pipeline. The system integrates Nvidia's parakeet, the Gemma 4 31B model served by Cerebras, and custom inference for Qwen3TTS.
- The stack functions as a drop-in replacement for OpenAI's realtime API.
- It is designed to see and search the web with low latency.
- Local execution is supported, with similar latencies achieved on a MacBook Pro M3 36GB using Gemma 4 E4B.
- A cloud-based web demo is available at hf-realtime-voice on Hugging Face Spaces.
This pipeline enables users to run local voice interactions and serves as the underlying technology for Reachy Minis.