Representation Fréchet Loss for Visual Generation
Summary
This paper introduces FD-loss, a method to optimize Fréchet Distance as a training objective for visual generation by decoupling population and batch sizes. It demonstrates that this approach improves generator quality and suggests FID may not always accurately reflect visual quality.
Source: https://huggingface.co/papers/2604.28190
Abstract
Fréchet Distance can be effectively optimized as a training objective when decoupling population size from batch size, leading to improved generator quality and alternative evaluation metrics.
We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr^k, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
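To make the quantity in the abstract concrete, here is a minimal NumPy sketch of the standard Fréchet distance between Gaussians fitted to two feature sets, ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^{1/2}). It is not the paper's implementation. The function name `frechet_distance`, the helper `sample`, and the random 16-dimensional "features" are all illustrative assumptions, and no real representation network is involved. The demo only suggests why the population size used for estimation matters: with few samples the estimate is inflated even when both sets come from the same distribution.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^{1/2}) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (up to numerical noise, hence the clip).
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    tr_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Hypothetical stand-in for features: i.i.d. standard normal vectors.
rng = np.random.default_rng(0)
dim = 16

def sample(n):
    return rng.standard_normal((n, dim))

# Both pairs are drawn from the same distribution, so the true FD is 0.
fd_small = frechet_distance(sample(32), sample(32))      # small population
fd_large = frechet_distance(sample(4000), sample(4000))  # large population
print(f"FD with n=32:   {fd_small:.4f}")
print(f"FD with n=4000: {fd_large:.4f}")
```

The small-population estimate comes out well above the large-population one, which is consistent with the abstract's choice to estimate FD over a large population (e.g., 50k) while computing gradients on a much smaller batch (e.g., 1024). How the paper couples the two is not described on this page, so the sketch stops at the estimator itself.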
Similar Articles
The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation
This paper analyzes the variance of FID scores across different training and sampling seeds, revealing significant reproducibility issues in image generation evaluation. It proposes a new evaluation protocol with error bars and per-cell optimal guidance tuning.
$f$-Trajectory Balance: A Loss Family for Tuning GFlowNets, Generative Models, and LLMs with Off- and On-Policy Data
This paper introduces a family of loss functions derived from f-divergences for training generative models like GFlowNets and LLMs, which are valid off-policy while matching on-policy gradients of the corresponding f-divergence. Applications include molecule discovery and asynchronous LLM tuning.
FFJORD: Free-form continuous dynamics for scalable reversible generative models
FFJORD introduces a scalable reversible generative model using continuous dynamics and Hutchinson's trace estimator to enable unbiased log-density estimation without architectural constraints. The method achieves state-of-the-art results on density estimation and image generation while maintaining efficient sampling.
Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field
Flow-Direct introduces a non-parametric guidance field for flow-based generative models that accumulates reward feedback persistently, improving feedback efficiency and enabling reuse of collected samples to guide generation for multiple objectives without additional reward evaluations.
Improved Techniques for Training Consistency Models
OpenAI presents improved techniques for training consistency models that enable high-quality single-step image generation without distillation, achieving significant FID improvements on CIFAR-10 and ImageNet 64×64 through novel loss functions and training strategies.