GPT-Realtime-2 improves on version 1.5 by 15 percentage points on the Big Bench Audio benchmark, bringing its score close to saturating the benchmark.
APEX is a large-scale multi-task learning framework that predicts both popularity and aesthetic quality of AI-generated music using frozen audio embeddings. The model demonstrates strong generalization across different generative architectures by jointly predicting engagement signals and perceptual quality dimensions.
Google has released Lyria 3, its newest music generation model, available to developers through the Gemini API and Google AI Studio. The model offers two variants: Lyria 3 Pro for full songs and Lyria 3 Clip for shorter clips, with controls for tempo, lyrics, and image-to-music multimodal input.
Working with Georgia Tech and the Wild Dolphin Project, Google has developed DolphinGemma, a large language model designed to learn and generate dolphin vocalizations. The project aims to advance understanding of dolphin communication patterns and enable potential interspecies dialogue.