Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation

arXiv cs.LG 05/20/26, 04:00 AM Papers

Summary

This paper introduces Domain Generalizable Dataset Distillation (DGDD), a new problem setting that targets out-of-distribution generalization of distilled datasets, and proposes Spectral Gradient Surgery (SGS) to disentangle class-discriminative and domain-specific information by leveraging cross-domain gradient agreement in the spectral domain.

arXiv:2605.18836v1

Abstract: Dataset Distillation (DD) synthesizes a compact synthetic dataset that preserves the training utility of a full dataset. However, its standard formulation assumes that test data follow the same distribution as training data, an assumption that rarely holds in practice. A straightforward extension, applying post-hoc Domain Generalization (DG) techniques to distilled data, is ill-suited because existing DG methods rely on the natural diversity of real datasets, which compact synthetic sets inherently lack, while also incurring substantial augmentation overhead that conflicts with the efficiency objective of dataset distillation. To address this limitation, we introduce Domain Generalizable Dataset Distillation (DGDD), a new problem setting that explicitly targets out-of-distribution (OOD) generalization of distilled datasets. We study this problem through a widely adopted DD baseline of Distribution Matching (DM). We attribute the OOD vulnerability of DM to the entanglement of class-discriminative and domain-specific information within the compressed synthetic set, and propose Spectral Gradient Surgery (SGS) to disentangle the two. The key insight of SGS is that cross-domain agreement among domain-wise gradients in the spectral domain reveals which gradient components are shared across source domains (and are therefore class-discriminative) and which are domain-specific. Based on this observation, SGS augments the standard DM update with two complementary gradients: one that reinforces cross-domain shared components and another that explicitly promotes diversity within the distilled dataset. Extensive experiments on diverse-scale benchmarks demonstrate that SGS substantially improves OOD generalization while remaining plug-and-play compatible with existing DM methods.
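The abstract does not say how cross-domain agreement is measured, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each source domain yields one gradient with respect to the same synthetic image, takes a 2D FFT of each, and scores every frequency by phase coherence across domains. The function name `spectral_agreement_split`, the coherence score, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def spectral_agreement_split(domain_grads, threshold=0.5, eps=1e-9):
    """Split the mean gradient into cross-domain shared and residual parts.

    ASSUMPTION: this is an illustrative stand-in, not the SGS procedure
    from the paper. `domain_grads` has shape (D, H, W): one gradient per
    source domain for a single synthetic image.
    """
    grads = np.asarray(domain_grads, dtype=np.float64)

    # Per-domain 2D spectra, shape (D, H, W), complex-valued.
    spectra = np.fft.fft2(grads, axes=(-2, -1))

    # Hypothetical agreement score in [0, 1] per frequency:
    # |sum_d F_d| / sum_d |F_d|. It equals 1 when all domains share the
    # same phase and approaches 0 when they cancel each other.
    num = np.abs(spectra.sum(axis=0))
    den = np.abs(spectra).sum(axis=0)
    agreement = np.divide(num, den, out=np.zeros_like(num), where=den > eps)

    # Frequencies with high agreement are treated as shared.
    mask = agreement >= threshold
    mean_spectrum = spectra.mean(axis=0)

    # Back to the pixel domain; imaginary parts are rounding noise.
    shared = np.fft.ifft2(mean_spectrum * mask).real
    specific = np.fft.ifft2(mean_spectrum * ~mask).real
    return shared, specific, agreement


if __name__ == "__main__":
    # Toy input: two domains share one pattern and disagree on another.
    x = np.arange(8)
    common = np.cos(2 * np.pi * x / 8)[None, :] * np.ones((8, 1))
    spec = np.sin(2 * np.pi * 3 * x / 8)[:, None] * np.ones((1, 8))
    grads = np.stack([common + spec, common - spec])
    shared, specific, agreement = spectral_agreement_split(grads)
    print("max agreement:", agreement.max())
    print("shared matches common pattern:", np.allclose(shared, common))
```

In this toy setup the two parts always sum back to the mean gradient, because the mask and its complement partition the spectrum. How the shared part is then weighted into the DM update, and how the diversity-promoting gradient is built, are not described in the abstract and are left out here.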


Similar Articles

Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining

GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models

Self-Distilled Policy Gradient

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

Diffusion Model as a Generalist Segmentation Learner
