facebook/VGGT-Omega

Hugging Face Models Trending 03/17/26, 08:47 PM Models

3d-vision foundation-model meta-ai oxford-vgg computer-vision open-source cvpr-2026

Summary

Meta AI and Oxford VGG released VGGT-Omega, a foundation model for 3D vision, with project page and GitHub repository.

Tags: facebook, meta-pytorch, en, license:cc-by-nc-4.0, region:us

Original Article


Cached at: 05/19/26, 06:34 PM


Source: https://huggingface.co/facebook/VGGT-Omega

Meta AI Research; University of Oxford, VGG

Jianyuan Wang, Minghao Chen, Shangzhan Zhang, Nikita Karaev, Johannes Schönberger, Patrick Labatut, Piotr Bojanowski, David Novotny, Andrea Vedaldi, Christian Rupprecht

Quick Start

Please refer to our GitHub repo.

Citation

If you find our repository useful, please consider giving it a star ⭐ and citing our paper in your work:

@inproceedings{wang2026vggtomega,
  title={VGGT-{$\Omega$}},
  author={Wang, Jianyuan and Chen, Minghao and Zhang, Shangzhan and Karaev, Nikita and Sch{\"o}nberger, Johannes and Labatut, Patrick and Bojanowski, Piotr and Novotny, David and Vedaldi, Andrea and Rupprecht, Christian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}

