Representation Fréchet Loss for Visual Generation

Papers with Code Trending Papers

Summary

This paper introduces FD-loss, a method to optimize Fréchet Distance as a training objective for visual generation by decoupling population and batch sizes. It demonstrates that this approach improves generator quality and suggests FID may not always accurately reflect visual quality.

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr^k, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
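
For reference, FID-style metrics rely on the standard closed-form Fréchet Distance between two Gaussians fitted to feature sets. The notation below is introduced here for illustration and is not taken from the paper: \mu_r, \Sigma_r are the mean and covariance of real-image features, and \mu_g, \Sigma_g those of generated-image features in the chosen representation space.

```latex
\mathrm{FD}
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Read against the abstract, the decoupling would mean that \mu_g and \Sigma_g are estimated over a large population (e.g., 50k samples) while gradients are computed only for the current batch (e.g., 1024 samples).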

Cached at: 05/08/26, 08:48 AM

Source: https://huggingface.co/papers/2604.28190

Abstract

Fréchet Distance can be effectively optimized as a training objective when decoupling population size from batch size, leading to improved generator quality and alternative evaluation metrics.

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr^k, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
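
The decoupling of population size from batch size can be illustrated with a small NumPy sketch. This is only one plausible reading of the idea, not the paper's confirmed implementation: the `FeatureBank` class, the FIFO update rule, and the toy Gaussian "features" are all assumptions made for illustration. In a real training loop the current batch's features would carry gradients in an autodiff framework while the rest of the bank is treated as constant; that part is omitted here.

```python
import numpy as np


def gaussian_stats(feats):
    """Return the mean vector and covariance matrix of an (n, d) feature array."""
    return feats.mean(axis=0), np.cov(feats, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Closed-form Frechet Distance between two Gaussians."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) is the sum of square roots of the eigenvalues of S1 S2,
    # which are real and non-negative for positive semi-definite S1 and S2.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


class FeatureBank:
    """Hypothetical FIFO buffer holding features of generated samples."""

    def __init__(self, feats):
        self.feats = np.array(feats, dtype=np.float64)
        self.ptr = 0

    def update(self, batch):
        # Overwrite the oldest entries with the newest batch (wraps around).
        idx = (self.ptr + np.arange(len(batch))) % len(self.feats)
        self.feats[idx] = batch
        self.ptr = int((self.ptr + len(batch)) % len(self.feats))


rng = np.random.default_rng(0)
dim, population, batch_size = 8, 2000, 64  # toy sizes, far smaller than 50k / 1024

real_feats = rng.normal(0.0, 1.0, size=(population, dim))
mu_r, sigma_r = gaussian_stats(real_feats)

# Start with a "bad" generator whose features are shifted away from the real ones.
bank = FeatureBank(rng.normal(1.0, 1.0, size=(population, dim)))
fd_before = frechet_distance(*gaussian_stats(bank.feats), mu_r, sigma_r)

# Simulate training: each step contributes only a small batch, but FD is always
# estimated from the full population stored in the bank.
for _ in range(40):
    bank.update(rng.normal(0.0, 1.0, size=(batch_size, dim)))
fd_after = frechet_distance(*gaussian_stats(bank.feats), mu_r, sigma_r)

print(f"FD before: {fd_before:.3f}, FD after: {fd_after:.3f}")
```

The point of the sketch is that the covariance estimate always uses the full population, so it stays well conditioned even though each step only touches a batch-sized slice of it.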

Get this paper in your agent:

hf papers read 2604.28190

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 1

#### jjiaweiyang/FD-Loss

Similar Articles

FFJORD: Free-form continuous dynamics for scalable reversible generative models

OpenAI Blog

FFJORD introduces a scalable reversible generative model using continuous dynamics and Hutchinson's trace estimator to enable unbiased log-density estimation without architectural constraints. The method achieves state-of-the-art results on density estimation and image generation while maintaining efficient sampling.

Improved Techniques for Training Consistency Models

OpenAI Blog

OpenAI presents improved techniques for training consistency models that enable high-quality single-step image generation without distillation, achieving significant FID improvements on CIFAR-10 and ImageNet 64×64 through novel loss functions and training strategies.