UniMesh: Unifying 3D Mesh Understanding and Generation

Hugging Face Daily Papers

Summary

UniMesh introduces a single model that jointly handles 3D mesh generation and understanding via a Mesh Head, Chain-of-Mesh iterative editing, and a self-reflection error-correction mechanism.

Recent advances in 3D vision have led to specialized models for either 3D understanding (e.g., shape classification, segmentation, reconstruction) or 3D generation (e.g., synthesis, completion, and editing). However, these tasks are often tackled in isolation, resulting in fragmented architectures and representations that hinder knowledge transfer and holistic scene modeling. To address these challenges, we propose UniMesh, a unified framework that jointly learns 3D generation and understanding within a single architecture. First, we introduce a novel Mesh Head that acts as a cross-model interface, bridging diffusion-based image generation with implicit shape decoders. Second, we develop Chain-of-Mesh (CoM), a geometric instantiation of iterative reasoning that enables user-driven semantic mesh editing through a closed-loop latent, prompting, and re-generation cycle. Third, we incorporate a self-reflection mechanism based on an Actor-Evaluator-Self-reflection triad to diagnose and correct failures in high-level tasks like 3D captioning. Experimental results demonstrate that UniMesh not only achieves competitive performance on standard benchmarks but also unlocks novel capabilities in iterative editing and mutual enhancement between generation and understanding. Code: https://github.com/AIGeeksGroup/UniMesh. Website: https://aigeeksgroup.github.io/UniMesh.
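
The abstract names the three roles of the self-reflection mechanism (Actor, Evaluator, Self-reflection) but does not describe how they interact. As a rough illustration only, the following minimal sketch shows one common way such a triad can be wired into a retry loop. Everything here is an assumption and not UniMesh's actual implementation: the function name `reflect_loop`, the `Attempt` record, the `threshold` and `max_rounds` parameters, and the toy caption functions are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Attempt:
    """One round of the loop: what was produced, how it scored, what was learned."""
    output: str
    score: float
    feedback: str


def reflect_loop(
    actor: Callable[[str, List[str]], str],
    evaluator: Callable[[str, str], float],
    reflector: Callable[[str, str, float], str],
    task: str,
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> List[Attempt]:
    """Hypothetical actor / evaluator / self-reflection loop (not from the paper)."""
    memory: List[str] = []      # textual notes carried across rounds
    history: List[Attempt] = []
    for _ in range(max_rounds):
        output = actor(task, memory)      # Actor: produce a candidate answer
        score = evaluator(task, output)   # Evaluator: score the candidate
        if score >= threshold:
            history.append(Attempt(output, score, ""))
            break
        # Self-reflection: turn the failure into a note for the next round
        feedback = reflector(task, output, score)
        memory.append(feedback)
        history.append(Attempt(output, score, feedback))
    return history


# Toy stand-ins so the sketch runs without any model; purely illustrative.
REQUIRED = ("four legs", "armrests")


def toy_actor(task: str, memory: List[str]) -> str:
    # Start from a bare caption and append every note gathered so far.
    return " ".join(["a chair", *memory])


def toy_evaluator(task: str, output: str) -> float:
    # Fraction of required phrases that the caption mentions.
    return sum(phrase in output for phrase in REQUIRED) / len(REQUIRED)


def toy_reflector(task: str, output: str, score: float) -> str:
    # Name the first missing detail as a correction hint.
    for phrase in REQUIRED:
        if phrase not in output:
            return f"with {phrase}"
    return ""


history = reflect_loop(toy_actor, toy_evaluator, toy_reflector, "caption the mesh")
for attempt in history:
    print(f"{attempt.score:.1f}  {attempt.output}")
```

In a real system the three callables would presumably wrap model calls (for example, a captioner, a scorer, and a critique prompt); the loop structure is the only point this sketch is meant to convey.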

Cached at: 04/22/26, 06:17 AM


Source: https://huggingface.co/papers/2604.17472





Similar Articles

Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

Hugging Face Daily Papers

Uni-Edit proposes using intelligent image editing as a single general task to simultaneously improve unified multimodal models' understanding, generation, and editing capabilities, with an automated data synthesis pipeline creating complex editing instructions.