@osanseviero: Super excited to introduce Gemma 4 12B! - Multimodal: audio, image, video, and text input - Novel architecture: we remo…


Summary

Gemma 4 12B is a multimodal model that accepts audio, image, video, and text input. It uses a unified architecture with the separate multimodal encoders removed. The announcement also lists a new macOS desktop app powered by LiteRT and MTP support.


Cached at: 06/03/26, 05:52 PM

Super excited to introduce Gemma 4 12B! 💎

  • Multimodal: audio, image, video, and text input
  • Novel architecture: we removed the multimodal encoders for a unified, streamlined arch
  • New MacOS desktop app powered by LiteRT
  • MTP support

Excited to see what you build with it! https://t.co/De5id2XQfz
