JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising

Hugging Face Daily Papers 06/18/26, 12:00 AM Papers

3d-generation visual-illusions zero-shot training-free diffusion mesh-generation text-driven

Summary

JanusMesh is a fast, training-free framework that generates text-driven 3D visual illusions—a single mesh revealing different semantics from different viewing angles—by decoupling generation into cross-space dual-branch denoising and view-conditioned texture synthesis, achieving high realism in just 3-5 minutes.

Creating 3D visual illusions, a single 3D mesh that reveals entirely different semantics from various viewing angles, is a fascinating but tough challenge. Existing optimization-based methods are slow and can produce oversaturated colors. In contrast, naive stitching approaches fail to produce geometrically coherent objects. This results in visible unnatural seams and semantic leaks. In this paper, we present a fast and training-free framework for generating text-driven 3D visual illusions. Our approach decouples the generation into two stages. First, we propose a cross-space dual-branch denoising process. This process dynamically decodes 3D latents into voxel space for CLIP-guided orientation alignment and Signed Distance Field (SDF) blending, which ensures seamless geometric fusion. Second, we introduce a view-conditioned texture synthesis module that projects and aggregates view-specific 2D diffusion priors onto the fused geometry. Extensive experiments demonstrate that our method generates highly realistic, dual-semantic 3D illusions in just 3-5 minutes. It significantly outperforms existing methods in geometric integrity, semantic recognizability, and efficiency. Project page: https://siang1105.github.io/JanusMesh.github.io/
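The abstract names SDF blending as the step that fuses the two shapes, but it does not say which blending operator is used. As a rough illustration only, the sketch below blends two signed distance fields on a voxel grid with a polynomial smooth minimum, a common way to merge SDFs without a hard crease. The function names, the two sphere stand-ins, and the blend width `k` are all assumptions made for this example, not details from the paper.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance from each point to a sphere (negative inside)."""
    return np.linalg.norm(points - np.asarray(center), axis=-1) - radius

def smooth_union(d1, d2, k=0.1):
    """Polynomial smooth minimum of two SDFs; k sets the width of the blend region."""
    h = np.clip(0.5 + 0.5 * (d2 - d1) / k, 0.0, 1.0)
    return d2 * (1.0 - h) + d1 * h - k * h * (1.0 - h)

def make_grid(resolution=32, extent=1.0):
    """Return a (R, R, R, 3) array of voxel-center coordinates in [-extent, extent]."""
    axis = np.linspace(-extent, extent, resolution)
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

# Two hypothetical shapes standing in for the decoded branches (assumption).
grid = make_grid()
sdf_a = sphere_sdf(grid, (-0.3, 0.0, 0.0), 0.4)
sdf_b = sphere_sdf(grid, (0.3, 0.0, 0.0), 0.4)

blended = smooth_union(sdf_a, sdf_b, k=0.1)
occupancy = blended < 0.0  # voxels inside the fused shape
```

A smooth minimum never exceeds the hard minimum of its inputs, so the fused shape always contains both originals; a larger `k` widens the rounded region where they meet.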

Original Article

Cached at: 06/20/26, 02:28 PM

Paper page - JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising

Source: https://huggingface.co/papers/2606.20563

Abstract

A fast, training-free framework generates text-driven 3D visual illusions by decoupling generation into cross-space dual-branch denoising and view-conditioned texture synthesis for seamless geometric fusion and semantic coherence.

Creating 3D visual illusions, a single 3D mesh that reveals entirely different semantics from various viewing angles, is a fascinating but tough challenge. Existing optimization-based methods are slow and can produce oversaturated colors. In contrast, naive stitching approaches fail to produce geometrically coherent objects. This results in visible unnatural seams and semantic leaks. In this paper, we present a fast and training-free framework for generating text-driven 3D visual illusions. Our approach decouples the generation into two stages. First, we propose a cross-space dual-branch denoising process. This process dynamically decodes 3D latents into voxel space for CLIP-guided orientation alignment and Signed Distance Field (SDF) blending, which ensures seamless geometric fusion. Second, we introduce a view-conditioned texture synthesis module that projects and aggregates view-specific 2D diffusion priors onto the fused geometry. Extensive experiments demonstrate that our method generates highly realistic, dual-semantic 3D illusions in just 3-5 minutes. It significantly outperforms existing methods in geometric integrity, semantic recognizability, and efficiency. Project page: https://siang1105.github.io/JanusMesh.github.io/
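The abstract says the texture module projects and aggregates view-specific priors onto the fused geometry, without describing the aggregation rule. One simple possibility, shown here purely as an assumption, is to weight each view's color at a vertex by how directly that vertex faces the view. The function name, array shapes, and the `sharpness` exponent below are hypothetical and not taken from the paper.

```python
import numpy as np

def aggregate_view_colors(normals, view_dirs, view_colors, sharpness=2.0, eps=1e-8):
    """
    Blend per-view vertex colors using cosine-based visibility weights.

    normals:     (V, 3) unit vertex normals
    view_dirs:   (K, 3) unit vectors pointing from the object toward each camera
    view_colors: (K, V, 3) RGB color that each view assigns to each vertex
    Returns (V, 3) blended colors. Vertices facing away from every view get zeros.
    """
    cos = np.clip(normals @ view_dirs.T, 0.0, None)   # (V, K), back-facing views clipped to 0
    weights = cos ** sharpness                        # emphasize head-on views
    total = weights.sum(axis=1, keepdims=True)
    weights = weights / np.maximum(total, eps)        # normalize per vertex
    return np.einsum("vk,kvc->vc", weights, view_colors)

# Toy setup (assumption): two views along +x and +z, three vertices.
s = np.sqrt(0.5)
normals = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [s, 0.0, s]])
view_dirs = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
red = np.tile([1.0, 0.0, 0.0], (3, 1))
blue = np.tile([0.0, 0.0, 1.0], (3, 1))
blended = aggregate_view_colors(normals, view_dirs, np.stack([red, blue]))
```

In this toy case the vertex facing +x takes the first view's color, the vertex facing +z takes the second, and the vertex halfway between receives an even mix. A higher `sharpness` narrows the transition band between the two semantics.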

Get this paper in your agent:

hf papers read 2606.20563

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

UniMesh: Unifying 3D Mesh Understanding and Generation

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation

Helix4D: Complex 4D Mesh Generation
