TencentARC/Pixal3D
Summary
Pixal3D is a high-fidelity single-image-to-3D model from TencentARC that explicitly lifts pixel features into 3D via back-projection, targeting near-reconstruction-level geometry and PBR textures. The paper was accepted to SIGGRAPH 2026, and inference code and an online demo are available.
Source: https://huggingface.co/TencentARC/Pixal3D
Pixal3D generates high-fidelity 3D assets from a single image. Unlike previous methods that loosely inject image features via attention, Pixal3D explicitly lifts pixel features into 3D through back-projection, establishing direct pixel-to-3D correspondences. This enables near-reconstruction-level fidelity with detailed geometry and PBR textures.
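To make the back-projection idea concrete, here is a minimal pure-Python sketch. It is not Pixal3D's implementation: it assumes a simple pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, 3D points already expressed in camera space, and nearest-pixel sampling. The names `project_point` and `lift_features` are illustrative only.

```python
import math
from typing import Any, List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-space point to (u, v) pixel coordinates."""
    x, y, z = p
    if z <= 0.0:
        # Points at or behind the camera plane have no corresponding pixel.
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def lift_features(points: Sequence[Point3D],
                  feature_map: Sequence[Sequence[Any]],
                  fx: float, fy: float, cx: float, cy: float) -> List[Optional[Any]]:
    """Give each 3D point the feature of the pixel it projects onto.

    feature_map is indexed as feature_map[row][col]. Points that project
    outside the image (or lie behind the camera) receive None.
    """
    height, width = len(feature_map), len(feature_map[0])
    lifted: List[Optional[Any]] = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None:
            lifted.append(None)
            continue
        # Nearest-pixel lookup (an assumption; bilinear sampling is also common).
        col = math.floor(uv[0] + 0.5)
        row = math.floor(uv[1] + 0.5)
        if 0 <= row < height and 0 <= col < width:
            lifted.append(feature_map[row][col])
        else:
            lifted.append(None)
    return lifted
```

With a 4x4 feature map and `fx = fy = cx = cy = 2.0`, the point `(0.0, 0.0, 1.0)` picks up the feature stored at row 2, column 2, while a point with negative depth gets `None`. The point of the sketch is that every 3D location receives the feature of one specific pixel, rather than attending over all image tokens.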
✨ News
- May 2026: Released the improved version, based on the Trellis.2 backbone. 💪
- May 2026: Released inference code and an online demo. 🤗
- Apr 2026: Our paper was accepted to SIGGRAPH 2026! 🎉
📌 Branches
| Branch | Description |
| --- | --- |
| main | Latest version: improved implementation based on the Trellis.2 backbone, with better performance. |
| paper | Paper version: original implementation based on Direct3D-S2, corresponding to the results reported in our SIGGRAPH 2026 paper. |
If you want to reproduce the results in our paper, please switch to the paper branch.
🎮 Try It Online
You can try Pixal3D directly in your browser, without any installation, via our Hugging Face Gradio demo.
🚀 Getting Started
Installation
Step 1: Follow TRELLIS.2 Installation
Please first follow the installation guide of TRELLIS.2 to set up the base environment.
Step 2: Install Additional Dependencies
pip install -r requirements.txt
Step 3: Install utils3d
pip install https://github.com/LDYang694/Storages/releases/download/20260430/utils3d-0.0.2-py3-none-any.whl
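After the three steps, a quick import check can confirm the environment is complete before running inference. The sketch below uses only the standard library. It assumes the wheel above installs a top-level module named `utils3d`, which is inferred from the wheel filename and not confirmed by this page; `check_modules` is a hypothetical helper name.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "utils3d" is an assumed import name (taken from the wheel filename).
    for name, found in check_modules(["utils3d"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Extend the list with any other packages from requirements.txt that you want to verify.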
Usage
Inference
Generate a GLB mesh from a single image:
python inference.py --image assets/test_image/0.png --output ./output.glb
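To process a folder of images, the command above can be wrapped in a small script. The sketch below is one possible approach, not part of the repository: it assumes it is run from the repository root (so `inference.py` is in the working directory), uses only the `--image` and `--output` flags shown above, and treats every `*.png` in a folder as an input. The helper names are hypothetical. After each run it reads the 12-byte GLB header (the `glTF` magic, version, and total length defined by the glTF 2.0 binary format) as a basic sanity check on the output file.

```python
import struct
import subprocess
import sys
from pathlib import Path
from typing import List, Tuple


def build_command(image: Path, output: Path) -> List[str]:
    """Build the inference.py call using only the two flags documented above."""
    return [sys.executable, "inference.py",
            "--image", str(image), "--output", str(output)]


def batch_commands(image_dir: Path, output_dir: Path) -> List[List[str]]:
    """One command per PNG in image_dir, writing <stem>.glb into output_dir."""
    return [build_command(img, output_dir / f"{img.stem}.glb")
            for img in sorted(image_dir.glob("*.png"))]


def read_glb_header(path: Path) -> Tuple[int, int]:
    """Return (version, total_length) from a GLB header, or raise ValueError."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError(f"{path}: too short to be a GLB file")
    # Little-endian: 4-byte magic, uint32 version, uint32 total length.
    magic, version, length = struct.unpack("<4sII", header)
    if magic != b"glTF":
        raise ValueError(f"{path}: missing 'glTF' magic")
    return version, length


def run_batch(image_dir: Path, output_dir: Path) -> None:
    """Run inference on every PNG and report the header of each output."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for cmd in batch_commands(image_dir, output_dir):
        subprocess.run(cmd, check=True)  # raises if inference.py fails
        version, length = read_glb_header(Path(cmd[-1]))
        print(f"{cmd[-1]}: glTF version {version}, {length} bytes")
```

For example, `run_batch(Path("assets/test_image"), Path("output"))` would write one `.glb` per test image into `output/`. Each image starts a fresh Python process, so the model is reloaded every time; that keeps the wrapper independent of the repository's internal API at the cost of speed.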
Web Demo
We provide a Gradio web demo for Pixal3D, which allows you to generate 3D meshes from images interactively.
python app.py
🤗 Acknowledgements
This project is heavily built upon Trellis.2 and Direct3D-S2. We also thank the following repos for their great contributions: Trellis.
📄 Citation
If you find this work useful, please consider citing:
@article{li2026pixal3d,
title = {Pixal3D: Pixel-Aligned 3D Generation from Images},
author = {Li, Dong-Yang and Zhao, Wang and Chen, Yuxin and Hu, Wenbo and Guo, Meng-Hao and Zhang, Fang-Lue and Shan, Ying and Hu, Shi-Min},
journal = {arXiv preprint arXiv:2605.10922},
year = {2026}
}