Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B (github.com) AI

The GitHub project “parlor” showcases an early, on-device system for real-time multimodal AI conversations, using a browser mic/camera input stream and replying with streamed audio. It runs locally via a FastAPI WebSocket server that performs speech and vision understanding with Gemma 4 E2B (LiteRT-LM) and text-to-speech with Kokoro. The demo targets Apple Silicon (e.g., M3 Pro) or Linux with a supported GPU and emphasizes hands-free features like voice activity detection and barge-in (interrupting mid-response).

April 06, 2026 05:36 Source: Hacker News