Andi from Hugging Face has released a fully open-source, free-to-use demo of a voice interaction pipeline. The system combines NVIDIA's Parakeet for speech recognition, the Gemma 4 31B model served by Cerebras as the language model, and custom inference for Qwen3TTS for speech synthesis.

  • The stack functions as a drop-in replacement for OpenAI's realtime API.
  • It is designed to see and search the web with low latency.
  • The stack also runs locally: a MacBook Pro M3 with 36 GB of memory reaches similar latencies using Gemma 4 E4B.
  • A cloud-based web demo is available at hf-realtime-voice on Hugging Face Spaces.
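
To illustrate what "drop-in replacement" could mean in practice, here is a minimal sketch that builds the same client messages for either backend and changes only the base URL. The event names (`input_audio_buffer.append`, `input_audio_buffer.commit`, `response.create`) are those of OpenAI's realtime API; the local address `ws://localhost:8000/v1/realtime` and the model name are hypothetical placeholders, and whether this stack accepts exactly these events is an assumption not confirmed by the announcement.

```python
import base64
import json
from urllib.parse import urlencode

OPENAI_BASE = "wss://api.openai.com/v1/realtime"
# Hypothetical address for a locally running stack (assumption, not from the release).
LOCAL_BASE = "ws://localhost:8000/v1/realtime"


def realtime_url(base: str, model: str) -> str:
    """Build the WebSocket URL; only `base` differs between backends."""
    return f"{base}?{urlencode({'model': model})}"


def turn_events(pcm_bytes: bytes) -> list[str]:
    """Serialize one user turn as realtime-style client events (JSON strings)."""
    append = {
        "type": "input_audio_buffer.append",
        # Audio is sent as base64-encoded bytes inside the JSON event.
        "audio": base64.b64encode(pcm_bytes).decode("ascii"),
    }
    commit = {"type": "input_audio_buffer.commit"}
    respond = {"type": "response.create"}
    return [json.dumps(event) for event in (append, commit, respond)]


# "example-model" is a placeholder name used only for illustration.
local_url = realtime_url(LOCAL_BASE, "example-model")
events = turn_events(b"\x00\x01\x02\x03")
```

Under these assumptions, a client written against `OPENAI_BASE` would need no changes beyond the URL passed to its WebSocket library.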

The same pipeline lets users run voice interactions locally and is the technology underlying Reachy Minis.
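
The three-stage structure of such a pipeline (speech recognition, then a language model, then speech synthesis) can be sketched as below. This is an illustrative outline only, not the demo's actual code: `run_turn`, `TurnResult`, and the `fake_*` stages are hypothetical names, and the stub stages stand in for Parakeet, Gemma, and Qwen3TTS so that the example runs without any models. Per-stage timing is included because latency is the headline property of the stack.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TurnResult:
    transcript: str
    reply: str
    audio: bytes
    timings_ms: dict = field(default_factory=dict)


def run_turn(
    audio_in: bytes,
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> TurnResult:
    """Run one voice turn through ASR -> LLM -> TTS, timing each stage."""
    timings = {}

    def timed(name, stage, arg):
        start = time.perf_counter()
        out = stage(arg)
        timings[name] = (time.perf_counter() - start) * 1000.0
        return out

    transcript = timed("asr", asr, audio_in)
    reply = timed("llm", llm, transcript)
    audio_out = timed("tts", tts, reply)
    return TurnResult(transcript, reply, audio_out, timings)


# Stub stages (assumptions for illustration; real stages would call the models).
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = run_turn(b"hello", fake_asr, fake_llm, fake_tts)
```

Because each stage is passed in as a plain callable, a cloud-served model and a local one can be swapped without touching `run_turn`, which mirrors how the announcement describes both cloud and local configurations of the same stack.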