The L2P paper introduces a Latent-to-Pixel transfer paradigm that leverages pre-trained latent diffusion models to create efficient pixel-space models capable of 4K generation with minimal training overhead.
This paper introduces LLaVA-UHD v4, which improves visual encoding efficiency in multimodal large language models by using slice-based encoding and intra-ViT early compression. It reduces computational costs by over 55% while maintaining or improving performance on high-resolution image tasks.
HiDream-ai has open-sourced HiDream-O1-Image (8B, released as HiDream-O1-Image-Dev), a unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) that natively handles text-to-image generation, image editing, and subject-driven personalization at resolutions up to 2,048×2,048, without external VAEs or disjoint text encoders. It debuted at #8 in the Artificial Analysis Text to Image Arena and is positioned as a leading open-weights text-to-image model.
SwiftI2V is a new efficient framework for high-resolution image-to-video generation that uses conditional segment-wise generation to achieve 2K synthesis at significantly reduced computational cost. It enables practical generation on a single consumer or datacenter GPU while maintaining input fidelity.