A developer built a local voice-controlled music system from an ESP32 microcontroller and a MacBook. It uses Magenta RealTime 2 for real-time music generation, MLX Whisper for transcription, and a Qwen model for tool calling, so the user can change musical elements such as genre and instruments through conversation.
Google released Magenta RealTime 2 on Hugging Face. It is an open-weights model for continuous, real-time, on-device music generation with ~200 ms latency, steerable by text, audio, or MIDI.
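
To make the tool-calling step of the voice-controlled system more concrete, here is a minimal sketch of how a JSON tool call emitted by a language model might be turned into a text prompt for a text-steerable generator. Everything in it is an assumption for illustration: the tool names (`set_genre`, `add_instrument`, `remove_instrument`), the JSON shape, and the helper functions are hypothetical and are not taken from the developer's project or from the Magenta RealTime, MLX Whisper, or Qwen APIs.

```python
import json

# Hypothetical starting state; the field names are illustrative only.
DEFAULT_STATE = {"genre": "ambient", "instruments": []}


def apply_tool_call(state, tool_call_json):
    """Return a new music state after applying one JSON-encoded tool call.

    The input state is never mutated, so earlier states stay valid.
    """
    call = json.loads(tool_call_json)
    name = call.get("name")
    args = call.get("arguments", {})

    # Copy the state so the caller's dict and list are left untouched.
    new_state = {
        "genre": state["genre"],
        "instruments": list(state["instruments"]),
    }

    if name == "set_genre":
        new_state["genre"] = args["genre"]
    elif name == "add_instrument":
        if args["instrument"] not in new_state["instruments"]:
            new_state["instruments"].append(args["instrument"])
    elif name == "remove_instrument":
        new_state["instruments"] = [
            i for i in new_state["instruments"] if i != args["instrument"]
        ]
    else:
        raise ValueError(f"unknown tool: {name}")
    return new_state


def state_to_prompt(state):
    """Render the state as a short text prompt for a text-steerable model."""
    parts = [state["genre"]]
    if state["instruments"]:
        parts.append("with " + ", ".join(state["instruments"]))
    return " ".join(parts)
```

In a pipeline like the one described, the transcribed speech would go to the language model, the model's tool call would pass through something like `apply_tool_call`, and the resulting prompt would be handed to the music generator. Keeping the state update pure (returning a new dict) makes it easy to undo a change or replay a conversation.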