TencentARC/Pixal3D


Summary

Pixal3D is a high-fidelity single-image-to-3D model by TencentARC and Microsoft. It explicitly lifts pixel features into 3D via back-projection to reach near-reconstruction-level geometry and PBR textures. The paper has been accepted to SIGGRAPH 2026, and the inference code and an online demo are available.

Task: image-to-3d
Tags: image-to-3d, arxiv:2605.10922, license:other, region:us

Cached at: 05/15/26, 12:16 AM

Source: https://huggingface.co/TencentARC/Pixal3D

Pixal3D generates high-fidelity 3D assets from a single image. Unlike previous methods that loosely inject image features via attention, Pixal3D explicitly lifts pixel features into 3D through back-projection, establishing direct pixel-to-3D correspondences. This enables near-reconstruction-level fidelity with detailed geometry and PBR textures.
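To make the back-projection idea concrete, here is a minimal sketch of lifting pixel features onto 3D points. It is not the Pixal3D implementation: the pinhole camera model, the nearest-neighbour sampling, and every name in it (`project`, `lift_pixel_features`, `fx`, `fy`, `cx`, `cy`) are assumptions chosen for illustration only.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in camera space
Feature = Sequence[float]            # one feature vector per pixel


def project(point: Point, fx: float, fy: float,
            cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-space point to pixel coordinates (u, v)."""
    x, y, z = point
    if z <= 0:  # behind the camera: no pixel observes this point
        return None
    return fx * x / z + cx, fy * y / z + cy


def lift_pixel_features(feature_map: Sequence[Sequence[Feature]],
                        points: Sequence[Point],
                        fx: float, fy: float,
                        cx: float, cy: float) -> List[Optional[Feature]]:
    """Assign to each 3D point the feature of the pixel it projects onto.

    feature_map is indexed as feature_map[row][col]. Points that fall
    outside the image or behind the camera receive None.
    """
    height, width = len(feature_map), len(feature_map[0])
    lifted: List[Optional[Feature]] = []
    for point in points:
        uv = project(point, fx, fy, cx, cy)
        feature = None
        if uv is not None:
            col = math.floor(uv[0] + 0.5)  # nearest pixel column
            row = math.floor(uv[1] + 0.5)  # nearest pixel row
            if 0 <= row < height and 0 <= col < width:
                feature = feature_map[row][col]
        lifted.append(feature)
    return lifted
```

In this sketch, all points along the same camera ray receive the same pixel feature, which is what gives a direct pixel-to-3D correspondence. A practical pipeline would more likely use interpolated sampling on batched tensors, but the README does not specify those details.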


✨ News

  • May 2026: Released the improved version based on the Trellis.2 backbone. 💪
  • May 2026: Released the inference code and online demo. 🤗
  • Apr 2026: Our paper was accepted to SIGGRAPH 2026! 🎉

📌 Branches

Branch | Description
main   | Latest version: improved implementation based on the Trellis.2 backbone, with better performance.
paper  | Paper version: original implementation based on Direct3D-S2, corresponding to the results reported in our SIGGRAPH 2026 paper.

If you want to reproduce the results in our paper, please switch to the paper branch.

🎮 Try It Online

You can try Pixal3D directly in your browser without any installation via our Hugging Face Gradio demo:

👉 Launch Demo

🚀 Getting Started

Installation

Step 1: Follow TRELLIS.2 Installation

Please first follow the installation guide of TRELLIS.2 to set up the base environment.

Step 2: Install Additional Dependencies

pip install -r requirements.txt

Step 3: Install utils3d

pip install https://github.com/LDYang694/Storages/releases/download/20260430/utils3d-0.0.2-py3-none-any.whl
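After the three steps, a quick sanity check can confirm that the expected packages are importable before running inference. The sketch below is an illustration rather than part of the repository: the helper name `missing_modules` is hypothetical, and the module names `torch` and `utils3d` are assumed from the TRELLIS.2 base environment and the wheel name above.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; adjust to match your actual environment.
REQUIRED = ["torch", "utils3d"]
```

Calling `missing_modules(REQUIRED)` returns an empty list when both packages are installed. Any names it returns point to the installation step that should be repeated.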

Usage

Inference

Generate a GLB mesh from a single image:

python inference.py --image assets/test_image/0.png --output ./output.glb
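If you want to drive the command above from a script, the following sketch builds the same command line and then checks that the output file starts with a valid GLB header. Only the `--image` and `--output` flags come from this README. The helper names are hypothetical, and the header check relies on the 12-byte layout defined by the glTF 2.0 binary format (the magic bytes `glTF`, a version, and the total file length), not on anything specific to Pixal3D.

```python
import struct
import subprocess
import sys
from pathlib import Path
from typing import List, Tuple

GLB_MAGIC = b"glTF"  # first four bytes of every binary glTF file


def inference_command(image: str, output: str) -> List[str]:
    """Build the argv for the README's inference command."""
    return [sys.executable, "inference.py", "--image", str(image), "--output", str(output)]


def read_glb_header(data: bytes) -> Tuple[int, int]:
    """Parse the 12-byte GLB header and return (version, declared_length)."""
    if len(data) < 12:
        raise ValueError("file is too short to be a GLB")
    magic, version, length = struct.unpack("<4sII", data[:12])
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic bytes")
    return version, length


def generate_and_check(image: str, output: str) -> int:
    """Run inference from the repository root, then sanity-check the GLB output."""
    subprocess.run(inference_command(image, output), check=True)
    data = Path(output).read_bytes()
    version, length = read_glb_header(data)
    if length != len(data):
        raise ValueError("GLB header length does not match the file size")
    return version
```

For example, `generate_and_check("assets/test_image/0.png", "./output.glb")` runs the command shown above and returns the container version reported in the file header.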

Web Demo

We provide a Gradio web demo for Pixal3D, which allows you to generate 3D meshes from images interactively.

python app.py

🤗 Acknowledgements

This project is heavily built upon Trellis.2 and Direct3D-S2. We also thank the following repos for their great contributions: Trellis.

📄 Citation

If you find this work useful, please consider citing:

@article{li2026pixal3d,
    title   = {Pixal3D: Pixel-Aligned 3D Generation from Images},
    author  = {Li, Dong-Yang and Zhao, Wang and Chen, Yuxin and Hu, Wenbo and Guo, Meng-Hao and Zhang, Fang-Lue and Shan, Ying and Hu, Shi-Min},
    journal = {arXiv preprint arXiv:2605.10922},
    year    = {2026}
}
