A satellite is now running Google's Gemma 3 vision-language model in orbit, doing onboard inference instead of downlinking everything first

Reddit r/singularity 06/19/26, 01:37 PM Models

satellite space-ai onboard-inference gemma-3 vision-language-model edge-inference bandwidth-optimization

Summary

Loft Orbital's YAM-9 satellite runs Google's Gemma 3 vision-language model onboard for real-time image analysis, cutting downlink bandwidth use and latency by deciding in orbit which data to send to Earth.

Loft Orbital's YAM-9 is running Gemma 3 onboard, reportedly the first vision-language model deployed in orbit. Rather than streaming every image down for ground analysis, the satellite analyzes what it sees while still in orbit and decides what is worth sending. The practical win is bandwidth and latency: downlink windows are scarce and expensive, so a satellite that can identify and prioritize imagery on its own changes what is worth the radio time at all. It is edge inference where the edge happens to be low Earth orbit. Source: https://aiweekly.co/alerts/loft-orbital-yam-9-satellite-deploys-gemma-3-ai-onboard
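
To make the prioritization idea concrete, here is a minimal sketch of picking captures for a limited downlink window. It is not Loft Orbital's method, which the post does not describe. The `Capture` type, the `relevance` score (imagined here as a 0 to 1 value produced by an onboard model), the `min_relevance` cutoff, and the greedy relevance-per-byte ranking are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """Hypothetical record for one onboard image (not a real satellite API)."""
    capture_id: str
    size_bytes: int   # assumed to be positive
    relevance: float  # assumed 0..1 score from an onboard model


def select_for_downlink(captures, budget_bytes, min_relevance=0.3):
    """Greedily choose captures for one downlink window.

    Captures scoring below min_relevance are dropped, the rest are ranked
    by relevance per byte, and each is taken if it still fits the budget.
    Returns the chosen captures and the total bytes used.
    """
    candidates = [c for c in captures if c.relevance >= min_relevance]
    ranked = sorted(
        candidates,
        key=lambda c: c.relevance / c.size_bytes,
        reverse=True,
    )
    chosen, used = [], 0
    for capture in ranked:
        if used + capture.size_bytes <= budget_bytes:
            chosen.append(capture)
            used += capture.size_bytes
    return chosen, used
```

Calling `select_for_downlink(captures, budget_bytes)` once per pass would return the subset to transmit. A greedy ranking is simple but not optimal; a real scheduler would likely also weigh capture age, tasking priority, and compression options.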

Similar Articles

A satellite just learned to find things on its own — here’s what that means

TechCrunch AI

A satellite called YAM-9 used Google DeepMind's Gemma 3 vision-language model in orbit to autonomously identify areas of interest based on natural language queries, marking the first reported use of a VLM in space and signaling a shift toward more autonomous satellite operations.

NAVI-Orbital: First In-Orbit Demonstration of a Zero-Shot Vision-Language Model for Autonomous Earth Observation

arXiv cs.AI

NAVI-Orbital demonstrates the first in-orbit deployment of a zero-shot vision-language model (Gemma 3) on a LEO satellite, enabling autonomous scene classification and semantic compression of Earth observation data without fine-tuning.

Introducing Gemma 3

Google DeepMind Blog

Google introduces Gemma 3, a collection of lightweight open models (1B, 4B, 12B, 27B) designed to run on a single GPU or TPU, featuring support for 140+ languages, a 128k-token context window, and multimodal capabilities. According to Google, the models outperform larger competitors like Llama 3 and DeepSeek-V3 while remaining efficient enough for on-device deployment.

Google's new Gemma 4 12B model is designed to run on any laptop with 16GB of RAM

Ars Technica

Google releases Gemma 4 12B, a compact AI model optimized for local laptop use with only 16GB of RAM, featuring multi-token prediction and streamlined multimodal capabilities for text, audio, and images.

Gemma 4 VLA Demo on Jetson Orin Nano Super

Hugging Face Blog

NVIDIA and Hugging Face publish a hands-on demo showing Gemma 4 running as a vision-language-action model entirely on the Jetson Orin Nano Super, using local STT/TTS and webcam input.